Abstract

In  this  paper  we  provide  econometric  tools  for  the  evaluation  of 
intertemporal  asset  pricing  models  using  specification-error  and  volatility 
bounds.  We  formulate  analog  estimators  of  these  bounds,  give  conditions  for 
consistency  and  derive  the  limiting  distribution  of  these  estimators.  The 
analysis  incorporates  market  frictions  such  as  short-sale  constraints  and 
proportional  transactions  costs.  Among  several  applications  we  show  how  to 
use  the  methods  to  assess  specific  asset  pricing  models  and  to  provide 
nonparametric  characterizations  of  asset  pricing  anomalies. 
In this paper we provide statistical methods for assessing asset-pricing
models  using  specification-error  and  volatility  bounds.  The  statistical 
procedures  can  account  for  market  frictions  due  to  transactions  costs  or 
short-sale  constraints,  and  are  easier  to  interpret  than  standard  tests  of 
asset-pricing  models.  For  the  most  part  these  methods  are  quite  easy  to 
implement,  even  when  market  frictions  are  considered.  They  are  designed  to 
provide  a  better  understanding  of  the  statistical  failures  of  some  popular 
asset-pricing  models  and  to  offer  guidance  in  improving  these  models. 

Models  of  asset  pricing  with  frictionless  markets  imply  that  asset  prices 
can  be  represented  by  a  stochastic  discount  factor  or  pricing  kernel.  A 
stochastic  discount  factor  "discounts"  payoffs  in  each  state  of  the  world 
and,  as  a  consequence,  adjusts  the  price  according  to  the  riskiness  of  the 
payoff.  For  example,  in  the  Capital  Asset  Pricing  Model  the  discount  factor 
is  given  by  a  constant  plus  a  scale  multiple  of  the  return  on  the  market 
portfolio.  In  the  Consumption-Based  CAPM  the  discount  factor  is  given  by  the 
intertemporal  marginal  rate  of  substitution  of  an  investor. 

The  implications  of  particular  models  with  observable  (up  to  a  finite 
number  of  parameters)  stochastic  discount  factors  are  often  tested  by  looking 
directly  at  the  average  "pricing  errors"  of  the  models.  Formal  statistical 
tests  are  performed  using  a  time  series  of  portfolio  payoffs  and  prices  by 
examining  whether  the  sample  analogs  of  the  average  and  predicted  prices  are 
significantly  different  from  each  other.  For  examples  of  this  type  of 
procedure,  see  Hansen  and  Singleton  (1982),  Brown  and  Gibbons  (1985), 
MacKinlay  and  Richardson  (1991)  and  Epstein  and  Zin  (1991). 

While tests such as these can be informative, it is often difficult to
interpret the resulting statistical rejections. Further, these tests are not
directly applicable when there are market frictions such as transactions
costs or short-sale constraints. Extending the models to allow for such
frictions entails inequalities instead of the pricing equalities that prevail
in frictionless market models (e.g., see Prisman 1986 and Jouini and Kallal
1993). Finally, these tests cannot be used when the candidate discount
factor depends on variables unavailable to the econometrician.

As  an  alternative  to  considering  the  average  pricing  errors  of  a  model, 
we  consider  a  different  set  of  tests  and  diagnostics  using  the 
specification-error  bounds  of  Hansen  and  Jagannathan  (1993),  and  the 
volatility  bounds  of  Hansen  and  Jagannathan  (1991).  We  also  consider 
extensions  of  these  tests  and  diagnostics,  developed  by  He  and  Modest  (1993) 
and  Luttmer  (1994),  that  handle  transactions  costs,  short-sale  restrictions 
and  other  market  frictions.  We  develop  an  econometric  methodology  to  provide 
consistent  estimators  of  the  specification-error  and  volatility  bounds,  and 
an  asymptotic  distribution  theory  that  is  easy  to  implement  and  that  can  be 
used  to  make  statistical  inferences. 

Among  other  things,  the  results  in  this  paper  allow  one  to:  (i)  test 
whether  a  specific  model  of  the  stochastic  discount  factor  satisfies  the 
volatility  bounds  implicit  in  asset-market  returns;  (ii)  compare  the 
information  about  the  means  and  standard  deviations  of  discount  factors 
contained  in  different  sets  of  asset  returns;  and  (iii)  test  hypotheses  about 
the  size  of  possible  pricing  errors  of  misspecified  asset  pricing  models. 

While  Burnside  (1994)  and  Cecchetti,  Lam  and  Mark  (1993)  have  devised 
tests  of  models  of  the  discount  factor  along  the  lines  of  (i),  their  tests 
are  based  on  a  different  parameterization  of  the  volatility  bound.  Our 
parameterization  yields  tests  that  are  simpler  to  implement  and  can 
accommodate market frictions in a straightforward manner. With regard to
(ii), our results permit these comparisons among data sets to be made
independent of a specific stochastic discount factor model. Our motivation
for (iii) is to shift the focus of statistical analyses of asset pricing
models away from whether the models are correctly specified and towards
measuring the extent to which they are misspecified.

The  rest  of  the  paper  is  organized  as  follows.  In  Section  1  we  review 
the  specification  and  volatility  bounds  of  Hansen  and  Jagannathan  (1991, 
1993),  He  and  Modest  (1993)  and  Luttmer  (1994).  We  show  formally  that  the 
volatility  bound  can  be  viewed  as  a  special  case  of  the  specification-error 
bound.  This  permits  us  to  develop  the  underlying  econometric  tools  in  a 
unified  way.  In  Section  2  we  provide  consistency  and  asymptotic  distribution 
results  for  estimators  of  the  bounds.  In  Section  3  we  present  two 
applications  of  our  results  of  Section  2,  each  of  which  can  be  read 
independently. Section 3.A shows how to use the volatility bounds to test
models of the discount factor. In Section 3.B we extend the distribution
theory of the specification-error bound to the case where there are
parameters of the discount factor proxy that are unknown and must be
estimated.

Section  4  discusses  the  limiting  distribution  of  the  parameters 
underlying  the  bounds  both  with  and  without  market  frictions.  Among  other 
things,  these  results  can  be  used  to  determine  whether  the  volatility  bound 
is  degenerate  or  more  generally  whether  additional  security  market  data 
sharpens  the  bound.  Finally,  Section  5  describes  some  extensions  and 
provides  some  concluding  remarks. 

1.   General Model and Bounds

Our  starting  point  is  a  model  in  which  asset  prices  are  represented  by  a 
stochastic  discount  factor  or  pricing  kernel.  To  accommodate  security  market 
pricing  subject  to  transactions  costs,  we  permit  there  to  be  short-sale 
constraints  for  a  subset  of  the  securities.  Although  a  short-sale  constraint 
is an extreme version of a transactions cost, other proportional transactions
costs such as bid-ask spreads can also be handled with this formalism. This
is done as in Foley (1970), Jouini and Kallal (1993) and Luttmer (1994) by
constructing  two  payoffs  according  to  whether  a  security  is  purchased  or 
sold.  A  short-sale  constraint  is  imposed  on  both  artificial  securities  to 
enforce  the  distinction  between  a  buy  and  a  sell,  and  a  bid-ask  spread  is 
modeled  by  making  the  purchase  price  higher  than  the  sale  price. 

Suppose the vector of security market payoffs used in an econometric
analysis is denoted x. The vector x is used to generate a collection of
payoffs formed using portfolio weights in a closed convex cone C of
$\mathbb{R}^n$:

$$P = \{p : p = \alpha' x \text{ for some } \alpha \in C\}. \qquad (1)$$

The cone C is constructed to incorporate all of the short-sale constraints
imposed in the econometric investigation. If there are no market frictions,
then C is $\mathbb{R}^n$. More generally, partition x into two components:
$x' = [x^{n\prime}, x^{s\prime}]$ where $x^n$ contains the k components not
subject to short-sale constraints and $x^s$ contains the $\ell$ components
subject to short-sale constraints. Then the cone C is formed by taking the
Cartesian product of $\mathbb{R}^k$ and the nonnegative orthant of
$\mathbb{R}^\ell$.

Let q denote the random vector of prices corresponding to the vector x
of securities payoffs. These prices are observed by investors at the time
assets are traded and are permitted to be random because the prices may
reflect conditioning information available to the investors. In the absence
of short-sale constraints, prices can be represented by:

$$q = E(mx \mid \mathcal{G}) \qquad (2)$$

where m is a stochastic discount factor and $\mathcal{G}$ is the information
set available to investors at the time of trade. Since it is difficult to
model empirically the conditioning information available to investors, we
instead work with the average or expected value of (2):

$$Eq - Emx = 0. \qquad (3)$$

Some conditioning information can be incorporated in the usual way by
multiplying the original set of payoffs and prices by random variables in the
conditioning information of economic agents.

More generally in the case of market frictions, x is partitioned in the
manner described previously and the pricing is represented by:

$$Eq^n - Emx^n = 0 \qquad (4)$$

$$Eq^s - Emx^s \geq 0.$$

For notational simplicity we write (4) as:

$$Eq - Emx \in C^* \qquad (5)$$

where the elements of $C^*$ are of the form $(0', \beta')'$, $\beta$
nonnegative. In the absence of frictions we take $C^*$ to contain only the
zero vector so that (5) encompasses (3). The inequality restriction emerges
because pricing the vector of payoffs $x^s$ subject to short-sale constraints
must allow for the possibility that these constraints bind and hence
contribute positively to the market price vector.


1.A  Maintained Assumptions

There  are  three  restrictions  on  the  vector  of  payoffs  and  prices  that 
are  central  to  our  analysis.  The  first  is  a  moment  restriction,  the  second 
is  equivalent  to  the  absence  of  arbitrage  on  the  space  of  portfolio  payoffs, 
and  the  third  eliminates  redundancy  in  the  securities. 

For  pricing  relation  (5)  to  have  content,  we  maintain: 


Assumption 1.1: $E|x|^2 < \infty$, $E|q| < \infty$.

Assumption 1.2: For any $\alpha \in C$, $\alpha' Eq > 0$ if
$\alpha' x \geq 0$ and $\text{Prob}\{\alpha' x > 0\} > 0$. Further for
$\alpha \in C$, $\alpha' Eq \geq 0$ if $\alpha' x = 0$.

Recall that C is the Cartesian product of $\mathbb{R}^k$ and the nonnegative
orthant of $\mathbb{R}^\ell$, which captures the short-sale restrictions on
some of the securities. Assumption 1.2 is a statement of the Principle of
No-Arbitrage applied to expected prices and modified to account for the fact
that C need not be linear [e.g., see Kreps (1981), Prisman (1986), Jouini and
Kallal (1993), and Luttmer (1994)]. It guarantees that there exists a
nonnegative stochastic discount factor m with finite second moment such that
(5) holds. Jouini and Kallal (1993) discuss additional assumptions that imply
the existence of a positive discount factor satisfying (5); however, in our
analysis we consider only nonnegative discount factors.

Next  we  limit  the  construction  of  x  by  ruling  out  redundancies  in  the 

securities: 

Assumption 1.3: If $\alpha' x = \alpha^{*\prime} x$ and
$\alpha' Eq = \alpha^{*\prime} Eq$ for some $\alpha$ and $\alpha^*$ in C,
then $\alpha = \alpha^*$.

In the absence of transaction costs, Assumption 1.3 precludes the possibility
that the second moment matrix of x is singular. Otherwise, there would exist
a nontrivial linear combination of the payoff vector x that is zero with
probability one. In light of (5), the (expected) price of this nontrivial
linear combination would have to be zero, violating Assumption 1.3. To
accommodate securities whose purchase price differs from the sale price, we
permit the second moment matrix of the composite vector x to be singular.
Assumption 1.3 then requires that distinct portfolio weights used to
construct the same payoff must have distinct expected prices.

1.B  Minimum-Distance Problems

There are two problems that underlie most of our analysis. Let M denote
the set of all random variables with finite second moments that satisfy (5),
and let $M^+$ be the set of all nonnegative random variables in M. Recall
that Assumption 1.2 implies that there is a nonnegative discount factor that
satisfies (5) so that both sets are nonempty. Let y denote some "proxy"
variable for a stochastic discount factor that, strictly speaking, does not
satisfy relations (5). Following Hansen and Jagannathan (1993), we consider
the following two ad hoc least squares measures of misspecification:

$$\delta^2 = \min_{m \in M} E[(y - m)^2], \qquad (6)$$

and

$$\tilde{\delta}^2 = \min_{m \in M^+} E[(y - m)^2]. \qquad (7)$$

Clearly, the specification-error bound implied by (7) is no smaller than that
implied by (6) since it is obtained using a smaller constraint set. The
solutions to (6) and (7) are the objects we are interested in estimating and
making inferences about. Sections 2 and 3 provide large-sample
justifications for the solutions to sample counterparts to these optimization
problems.

By  setting  the  proxy  y  to  zero,  the  specification  error  problems 
collapse  to  finding  bounds  on  the  second  moment  of  stochastic  discount 
factors  as  constructed  by  Hansen  and  Jagannathan  (1991),  He  and  Modest  (1993) 
and  Luttmer  (1994).  In  particular,  the  bounds  derived  in  Hansen  and 
Jagannathan  (1991)  are  obtained  by  setting  y  to  zero  and  solving  (6)  and  (7) 
when there are no short-sale constraints imposed (when C is set to $\mathbb{R}^n$); the
bound  derived  in  He  and  Modest  (1993)  is  obtained  by  solving  (6)  for  y  set  to 
zero;  and  the  bound  derived  by  Luttmer  (1994)  is  obtained  by  solving  (7)  for 
y  set  to  zero.  These  second  moment  bounds  will  subsequently  be  used  in 
deriving  feasible  regions  for  means  and  standard  deviations  of  stochastic 
discount  factors. 

In  solving  the  least  squares  problems  (6)  and  (7)  and  in  developing 
econometric  methods  associated  with  those  problems,  it  is  most  convenient  to 
study  the  conjugate  maximization  problems.   They  are  given  by 


$$\delta^2 = \max_{\alpha \in C} \{Ey^2 - E[(y - x'\alpha)^2] - 2\alpha' Eq\}, \qquad (8)$$

and

$$\tilde{\delta}^2 = \max_{\alpha \in C} \{Ey^2 - E[(y - x'\alpha)^{+2}] - 2\alpha' Eq\} \qquad (9)$$

where the notation $h^+$ denotes max{h,0}. The conjugate problems are
obtained by introducing Lagrange multipliers on the pricing constraints in
(5) and exploiting the familiar saddle point property of the Lagrangian. The
$\alpha$'s then have interpretations as the multipliers on the pricing
constraints.

The conjugate problems in (8) and (9) are convenient because the choice
variables are finite-dimensional vectors whereas the choice variables in the
original least squares problems (6) and (7) are random variables that reside
in possibly infinite-dimensional constraint sets. In fact, optimization
problem (8) is a standard quadratic programming problem. The specifications
of the conjugate problems are justified formally in Hansen and Jagannathan
(1991, 1993) and Luttmer (1994). In Section 2 we develop the properties of
estimators of $\delta$ and $\tilde{\delta}$ based on time series sample
analogs to problems (8) and (9). In so doing we rely on several important
properties of the solutions to problems (8) and (9) and on an additional
identification assumption.
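
As a rough illustration of why the conjugate form is convenient, consider the frictionless case ($C = \mathbb{R}^n$). Problem (8) is then an unconstrained concave quadratic, and setting its gradient to zero gives $\alpha = (Exx')^{-1}(Exy - Eq)$ and $\delta^2 = (Exy - Eq)'(Exx')^{-1}(Exy - Eq)$. The sketch below replaces population moments by sample averages. The function name `frictionless_bound` and the array layout are assumptions made for this illustration and do not come from the paper.

```python
import numpy as np

def frictionless_bound(x, q, y):
    """Sample analog of problem (8) when C = R^n (no short-sale constraints).

    x : (T, n) array of payoffs
    q : (T, n) array of prices
    y : (T,) array holding the discount-factor proxy
    Returns (d, alpha): d estimates delta, and alpha solves the sample
    first-order condition  mean(xx') alpha = mean(xy) - mean(q).
    """
    T = x.shape[0]
    second_moment = x.T @ x / T                           # sample E[xx']
    gap = (x * y[:, None]).mean(axis=0) - q.mean(axis=0)  # sample E[xy] - E[q]
    alpha = np.linalg.solve(second_moment, gap)
    # The maximized criterion equals gap' (E[xx'])^{-1} gap; guard rounding.
    d_squared = max(float(gap @ alpha), 0.0)
    return np.sqrt(d_squared), alpha
```

Setting `y` to a vector of zeros gives the second moment bound without the positivity restriction; a proxy that prices every payoff exactly in the sample yields an estimate of zero.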

Notice that the criteria for the maximization problems are concave in
$\alpha$ and that the first-order conditions for the solutions are given by:

$$Eq - E[(y - x'\alpha)x] \in C^* \qquad (10)$$

in the case of problem (8) and

$$Eq - E[(y - x'\alpha)^+ x] \in C^* \qquad (11)$$

in the case of problem (9), along with the respective complementary slackness
conditions. Interpreting the first-order conditions for these problems,
observe that associated with a solution to problem (8) is a random variable
$m = (y - x'\alpha)$ in M and associated with a solution to problem (9) is a
nonnegative random variable $\tilde{m} = (y - x'\alpha)^+$ in $M^+$. These
random variables are the unique (up to the usual equivalence class of random
variables that are equal with probability one) solutions to the original
least squares problems (6) and (7).

Consistency of the estimators of $\delta$ and $\tilde{\delta}$ relies upon
the fact that the sets of solutions to (8) and (9) are compact. Compactness
in the case of (8) is easily established. Since Assumption 1.3 eliminates
redundant securities and the random variable $(y - x'\alpha)$ is uniquely
determined, the solution $\alpha$ to conjugate problem (8) is also unique.
This follows because the value of the criterion must be the same for all
solutions, implying that they all must have the same expected price
$\alpha' Eq$. The solution to conjugate problem (9) may not be unique,
however. In this case the truncated random variable $(y - x'\alpha)^+$ is
uniquely determined, as is the expected price $\alpha' Eq$. On the other
hand, the random variable $(y - x'\alpha)$ is not necessarily unique, so we
cannot exploit Assumption 1.3 to verify that the solution $\alpha$ is unique.
The set of solutions is convex due to the concavity of the criterion and the
convexity of the constraint set. As is shown in Appendix A, it is also
compact.

As  is  typical  in  asymptotic  distribution  theory,  in  Section  2  we  will 
need  an  identification  restriction  that  there  is  a  unique  solution  to  the 
conjugate  problems,  except  when  all  prices  are  constant.  Since  the  set  of 
solutions  is  convex,  local  uniqueness  implies  global  uniqueness.  To  display 
a sufficient condition for local uniqueness, let $x^*$ denote the component
of the composite payoff vector x for which the pricing relation is satisfied
with equality:

$$E\tilde{m}x^* = Eq^* \qquad (12)$$

where $q^*$ is the corresponding price vector. Notice that in addition to
$x^n$, $x^*$ may contain elements of $x^s$. Let $1_{\{\tilde{m}>0\}}$ be the
indicator function for the event $\{\tilde{m} > 0\}$. A sufficient condition
for local uniqueness is that

Assumption 1.4: $E[x^* x^{*\prime} 1_{\{\tilde{m}>0\}}]$ is nonsingular.

To see why this is a valid sufficient condition, observe that from the
complementary slackness conditions we know the multipliers $\alpha$ are zero
whenever the pricing constraints are satisfied with a strict inequality. As
a result $\tilde{m}$ is given by $(y - x^{*\prime}\beta)^+$ for some vector
$\beta$, and consequently,

$$Eq^* = E[y x^* 1_{\{\tilde{m}>0\}}] - E[x^* x^{*\prime} 1_{\{\tilde{m}>0\}}]\beta. \qquad (13)$$

When the matrix $E(x^* x^{*\prime} 1_{\{\tilde{m}>0\}})$ is nonsingular, we
can solve (13) for $\beta$.

1.C  Volatility Bounds and Restrictions on Means

The  second  moment  bounds  described  in  the  previous  subsection  can  be 
converted  into  standard  deviation  bounds  via  the  formulas: 

$$\sigma = [\delta^2 - (Em)^2]^{1/2} \qquad (14)$$

$$\tilde{\sigma} = [\tilde{\delta}^2 - (Em)^2]^{1/2}$$

where $\delta^2$ and $\tilde{\delta}^2$ are constructed by setting the proxy
to zero. Em is equal to the average price of the unit payoff when trade in
this payoff is not subject to transaction costs. If there are transactions
costs or if no data are available on the price of a conditionally riskless
payoff then Em cannot be identified. In these circumstances, volatility
bounds can still be obtained for each choice of Em by adding a unit payoff to
P (augmenting x with a 1) and assigning a price of Em to that payoff
(augmenting Eq with Em). In forming the augmented cone, there should be no
short-sale constraints imposed on the additional security. Mean-specific
volatility bounds can then be obtained using (8), (9) and (14).
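
The augmentation step can be sketched for the simplest case: no short-sale constraints and no positivity restriction, so that (8) with the proxy set to zero has the closed-form value $\delta^2 = Eq'(Exx')^{-1}Eq$ for the augmented payoffs and prices. The function name `volatility_bound`, its arguments, and the use of sample averages in place of expectations are assumptions of this illustration.

```python
import numpy as np

def volatility_bound(x, q, mean_m):
    """Standard-deviation bound for a hypothesized mean of m (no frictions).

    x : (T, n) array of payoffs, q : (T, n) array of prices.
    A unit payoff priced at mean_m is appended, the second moment bound is
    computed with the proxy set to zero, and (14) converts it into a
    standard deviation bound.
    """
    T = x.shape[0]
    x_aug = np.column_stack([np.ones(T), x])                # augment x with a 1
    price_aug = np.concatenate([[mean_m], q.mean(axis=0)])  # augment Eq with Em
    second_moment = x_aug.T @ x_aug / T
    delta_sq = float(price_aug @ np.linalg.solve(second_moment, price_aug))
    # Equation (14); the max guards against rounding slightly below zero.
    return np.sqrt(max(delta_sq - mean_m**2, 0.0))
```

Evaluating the function over a grid of `mean_m` values traces out a sample mean-standard deviation frontier for discount factors. With frictions or the positivity restriction, the closed form no longer applies and problems (8) or (9) must be solved numerically.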

Although Em may not be identified, the Principle of No-Arbitrage does
put bounds on the admissible values of Em: $Em \in [\lambda_0, \upsilon_0]$
where $\lambda_0$ is the lower arbitrage bound and $\upsilon_0$ is the upper
arbitrage bound. These bounds are computed using formulas familiar from
derivative claims pricing:

$$\lambda_0 = -\inf\{\alpha' Eq : \alpha \in C \text{ and } \alpha' x \geq -1\} \qquad (15)$$

$$\upsilon_0 \equiv \inf\{\alpha' Eq : \alpha \in C \text{ and } \alpha' x \geq 1\} \qquad (16)$$

While $\lambda_0$ is always well defined via (15), $\upsilon_0$ may not be
because there may not exist a payoff in P that dominates a unit payoff. In
such circumstances, we define $\upsilon_0$ to be $+\infty$. In Section 2 we
show how to consistently estimate $\lambda_0$ and $\upsilon_0$. Consistent
estimation of these bounds is important since the standard deviation bound
$\tilde{\sigma}$ is infinite for choices of Em outside these bounds.


2.   Estimation  of  the  Bounds 

In  this  section  we  develop  consistency  and  asymptotic  distribution 
results  for  the  specification-error  bounds  presented  in  Section  1.  A  key 
presumption  underlying  our  analysis  is  that  the  data  on  asset  payoffs  and 
prices  are  replicated  over  time  in  some  stationary  fashion.  That  is, 
associated with the composite vector (x', q', y)' is a stochastic process
$\{(x_t', q_t', y_t)'\}$ whose sequence of empirical distributions
approximates the joint distribution of (x', q', y)'. We denote integration
with respect to the empirical distribution for sample size T as $\Sigma_T$.
More precisely, for any z that is a (Borel measurable) function of
(x', q', y) with a finite first moment, we will approximate Ez by
$\Sigma_T z$ where

$$\Sigma_T z = (1/T) \sum_{t=1}^{T} z_t. \qquad (17)$$

Among other things, we require that this approximation becomes arbitrarily
good as the sample size T gets large. That is, we presume that $\{z_t\}$
obeys a Law of Large Numbers. A sufficient condition for this is:

Assumption 2.1: The composite process $\{(x_t', q_t', y_t)\}$ is stationary
and ergodic.

Under this assumption, we can think of (x', q', y) as $(x_0', q_0', y_0)$.

To estimate the specification-error bounds, we suppose that a sample of
size T is available and that the empirical distribution implied by this data
is used in place of the population distribution. [Thus we are applying the
Analogy Principle of Goldberger (1968) and Manski (1988)]. We introduce two
random functions $\phi$ and $\tilde{\phi}$:

$$\phi(\alpha) = y^2 - (y - \alpha' x)^2 - 2\alpha' q, \qquad (18)$$

and

$$\tilde{\phi}(\alpha) = y^2 - (y - \alpha' x)^{+2} - 2\alpha' q. \qquad (19)$$

The analog estimators of the bounds are the maximized sample criteria:

$$(d_T)^2 = \max_{\alpha \in C} \Sigma_T[\phi(\alpha)] \qquad (20)$$

and

$$(\tilde{d}_T)^2 = \max_{\alpha \in C} \Sigma_T[\tilde{\phi}(\alpha)]. \qquad (21)$$
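
One possible way to compute the estimators in (20) and (21) is sketched below. Both sample criteria are concave and continuously differentiable in the weights, and the cone C amounts to sign restrictions on some coordinates, so a bound-constrained quasi-Newton routine is a natural choice. The use of SciPy's L-BFGS-B method, the function name `bound_estimate`, and the convention that the last `n_short` columns of `x` are the short-sale constrained securities are assumptions of this sketch, not choices made in the paper.

```python
import numpy as np
from scipy.optimize import minimize

def bound_estimate(x, q, y, n_short=0, positive=False):
    """Sample analog (20) (positive=False) or (21) (positive=True).

    x : (T, n) payoffs, q : (T, n) prices, y : (T,) proxy.
    Returns (d, a): the bound estimate and a maximizer in the cone C.
    """
    T, n = x.shape
    q_bar = q.mean(axis=0)

    def neg_criterion(a):
        resid = y - x @ a
        if positive:
            resid = np.maximum(resid, 0.0)      # (y - a'x)^+
        value = np.mean(y**2) - np.mean(resid**2) - 2.0 * a @ q_bar
        grad = 2.0 * (x * resid[:, None]).mean(axis=0) - 2.0 * q_bar
        return -value, -grad                    # minimize the negative

    # Cone C: free weights first, then nonnegative weights.
    cone = [(None, None)] * (n - n_short) + [(0.0, None)] * n_short
    res = minimize(neg_criterion, np.zeros(n), jac=True,
                   bounds=cone, method="L-BFGS-B")
    return np.sqrt(max(-res.fun, 0.0)), res.x
```

Because the feasible set for (21) in the original least squares problem is smaller, the estimate with `positive=True` should be no smaller than the one with `positive=False` on the same data, up to optimizer tolerance.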


2.A  Consistent Estimation of the Specification-Error Bounds

We first establish the statistical consistency of the estimator
sequences $\{d_T\}$ and $\{\tilde{d}_T\}$:

Proposition 2.1: … $\{d_T\}$ and $\{\tilde{d}_T\}$ converge almost surely to
$\delta$ and $\tilde{\delta}$, respectively.

The  proof  of  this  proposition  is  given  in  Appendix  A  and  is  not  complicated 
by  the  presence  of  short-sale  constraints.  The  basic  idea  is  that  the 
population  and  sample  criterion  functions  for  the  conjugate  problems  are 
concave  and  the  sets  of  maximizers  are  convex.  By  Assumptions  1.1  and  2.1, 
the criterion functions converge pointwise (in $\alpha$ and $\tilde{\alpha}$) almost surely to the
population criterion functions introduced in Section 1.B. In light of the
concavity  of  the  criterion  functions,  this  convergence  is  uniform  on  compact 
sets  almost  surely  [for  example,  see  Rockafellar  (1970)].  Finally,  since  the 
sets  of  maximizers  of  the  limiting  criterion  functions  are  compact,  for 
sufficiently  large  T  one  can  find  a  compact  set  such  that  the  maximizers  of 
the  sample  and  population  criteria  are  contained  in  that  compact  set  [for 
example,  see  Hildenbrand  (1974)  and  Haberman  (1989)].  Hence  the  conclusion 
follows  from  the  uniform  convergence  of  the  criteria  on  a  compact  set. 

2.B  Asymptotic  Distribution  of  the  Estimators  of  the  Bounds 

We  consider  next  the  limiting  distribution  of  the  analog  estimator 
sequences  of  the  specification-error  bounds.  Our  ability  to  express  the 
objects  of  interest  as  solutions  to  the  conjugate  problems  permits  us  to 
obtain results very similar to those in the literature on using likelihood
ratios as devices for model selection in environments when models are
possibly misspecified [for example, see Vuong (1989)]. We show that when the
specification error bounds are positive, we obtain a limiting distribution
that is equivalent to the one obtained by ignoring parameter estimation, and
when the specification error bound is zero the limiting distribution is
degenerate. [See Theorem 3.3 of Vuong (1989) page 307 for the corresponding
result for likelihood ratios.]

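Let $a_T$ be a maximizer of $\Sigma_T \phi$, $\alpha$ a maximizer of
$E\phi$, $\tilde{a}_T$ a maximizer of $\Sigma_T \tilde{\phi}$, and
$\tilde{\alpha}$ a maximizer of $E\tilde{\phi}$. To study the limiting
behavior of the estimators, we use the decompositions:

$$\sqrt{T}[(d_T)^2 - \delta^2] = \sqrt{T}\,\Sigma_T[\phi(a_T) - \phi(\alpha)] + \sqrt{T}\,\Sigma_T[\phi(\alpha) - E\phi(\alpha)], \qquad (22)$$

and

$$\sqrt{T}[(\tilde{d}_T)^2 - \tilde{\delta}^2] = \sqrt{T}\,\Sigma_T[\tilde{\phi}(\tilde{a}_T) - \tilde{\phi}(\tilde{\alpha})] + \sqrt{T}\,\Sigma_T[\tilde{\phi}(\tilde{\alpha}) - E\tilde{\phi}(\tilde{\alpha})]. \qquad (23)$$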


We  make  the  following  assumptions: 


Assumption 2.2:

$$\sqrt{T}\,\Sigma_T \begin{bmatrix} \phi(\alpha) - E\phi(\alpha) \\ (mx - q) - E(mx - q) \end{bmatrix}$$

converges in distribution to a normally distributed random vector with mean
zero and covariance matrix V.

Assumption 2.3:

$$\sqrt{T}\,\Sigma_T \begin{bmatrix} \tilde{\phi}(\tilde{\alpha}) - E\tilde{\phi}(\tilde{\alpha}) \\ (\tilde{m}x - q) - E(\tilde{m}x - q) \end{bmatrix}$$

converges in distribution to a normally distributed random vector with mean
zero and covariance matrix $\tilde{V}$.

More primitive assumptions that imply the central limit approximations
underlying Assumptions 2.2 and 2.3 are given by Gordin (1969) and Hall and
Heyde (1980).

Let u denote a selection vector with a one in its first position
followed by $k + \ell$ zeros. The limiting distributions for the
specification-error bound estimators are given by Proposition 2.2:

Proposition 2.2: Suppose that $\delta \neq 0$ and $\tilde{\delta} \neq 0$.
Under Assumptions 1.1 - 1.3, 2.1 - 2.2, $\{\sqrt{T}[d_T - \delta]\}$
converges to a normally distributed random vector … 2.3,
$\{\sqrt{T}[\tilde{d}_T - \tilde{\delta}]\}$ converges in distribution to a
normally distributed random …

As Proposition 2.2 indicates, the limiting distributions for the maximized
values depend only on the second terms of the decompositions in (22) and
(23). In other words, the impact of replacing the unknown population
maximizers by the sample maximizers in the sample criterion functions is
negligible. As a consequence, the presence of short-sale constraints does not
complicate the limiting distribution.

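To see why the asymptotic distribution in Proposition 2.2 is not
affected by sampling error in the estimation of the multipliers, consider the
case of the sequence $\{(\tilde{d}_T)^2\}$. By the concavity of
$\tilde{\phi}$, we have the following gradient inequalities:

$$\Sigma_T[\tilde{\phi}(\tilde{a}_T) - \tilde{\phi}(\tilde{\alpha})] \leq 2\,\Sigma_T(\tilde{m}x - q)\cdot(\tilde{a}_T - \tilde{\alpha}) \qquad (24)$$

$$= 2\,\Sigma_T[(\tilde{m}x - q) - E(\tilde{m}x - q)]\cdot(\tilde{a}_T - \tilde{\alpha}) + 2\,E(\tilde{m}x - q)\cdot(\tilde{a}_T - \tilde{\alpha}).$$

However, it follows from the first-order conditions (including the
complementary slackness conditions) for the population conjugate problem that

$$E(\tilde{m}x - q)\cdot(\tilde{a}_T - \tilde{\alpha}) \leq 0 \qquad (25)$$

because $\tilde{a}_T$ is constrained to be in C. The left side of (24) is
nonnegative because $\tilde{a}_T$ maximizes the sample criterion over C.
Combining (24) and (25) we have that

$$0 \leq \sqrt{T}\,\Sigma_T[\tilde{\phi}(\tilde{a}_T) - \tilde{\phi}(\tilde{\alpha})] \qquad (26)$$

$$\leq 2\sqrt{T}\,\Sigma_T[(\tilde{m}x - q) - E(\tilde{m}x - q)]\cdot(\tilde{a}_T - \tilde{\alpha}).$$

Since, by Assumption 2.3, the sample counterparts of the pricing errors obey
a Central Limit Theorem,
$\{\sqrt{T}\,\Sigma_T[\tilde{\phi}(\tilde{a}_T) - \tilde{\phi}(\tilde{\alpha})]\}$
converges in probability to zero if the maximizers can be chosen so that
$\{(\tilde{a}_T - \tilde{\alpha})\}$ converges almost surely to zero. This
latter convergence can be demonstrated by exploiting the concavity of the
population criterion function and the convexity of the constraint set [for
example, see the discussion on page 1635 of Haberman (1989) and Appendix A].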

To use Proposition 2.2 in practice requires consistent estimation of
$u'Vu$ or $u'\tilde{V}u$. Consider the case of $u'Vu$. For each T form the
scalar sequence $\{\phi_t(a_T): t = 1, 2, 3, \ldots, T\}$ and use one of the
frequency zero spectral density estimators described by Newey and West (1987)
or Andrews (1991), for example.
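
A minimal sketch of this step, using the Bartlett-kernel estimator associated with Newey and West (1987), is given below. The lag truncation `lags` is left to the user, and the conversion from a standard error for the squared bound to one for the bound itself uses a delta-method approximation that is only sensible when the bound is nonzero. The function names and the delta-method step are assumptions of this illustration.

```python
import numpy as np

def newey_west_lrv(z, lags):
    """Bartlett-kernel estimate of the long-run variance of a scalar series.

    With lags = 0 this is the ordinary sample variance (divisor T).
    """
    z = np.asarray(z, dtype=float)
    T = z.size
    u = z - z.mean()
    lrv = u @ u / T
    for j in range(1, lags + 1):
        weight = 1.0 - j / (lags + 1.0)            # Bartlett weights
        lrv += 2.0 * weight * (u[j:] @ u[:-j]) / T
    return lrv

def bound_std_errors(phi_t, lags):
    """Standard errors for (d_T)^2 and d_T from the series phi_t(a_T).

    phi_t : criterion values evaluated at the sample maximizer, so that
    their average equals (d_T)^2. The second standard error divides by
    2 * d_T (delta method) and requires a strictly positive average.
    """
    phi_t = np.asarray(phi_t, dtype=float)
    T = phi_t.size
    d_squared = phi_t.mean()
    se_d_squared = np.sqrt(newey_west_lrv(phi_t, lags) / T)
    se_d = se_d_squared / (2.0 * np.sqrt(d_squared))
    return se_d_squared, se_d
```

The same functions apply to the series $\tilde{\phi}_t(\tilde{a}_T)$ when estimating $u'\tilde{V}u$.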

As is shown in Appendix A, when the price vector q is a vector of real
numbers (degenerate random variables), the asymptotic distribution for
$\{\sqrt{T}[\tilde{d}_T - \tilde{\delta}]\}$ remains valid even when the
population version of the conjugate maximum problem fails to have a unique
solution (Assumption 1.4 is violated). In this case, the lack of
identification of the parameter vector $\tilde{\alpha}$ does not alter the
distribution theory for the specification-error bound. While this special
case is of considerable interest, it rules out the possibility of using
conditioning information to form synthetic payoffs as described in
Section 1.

Notice that if $\delta = 0$ or $\tilde{\delta} = 0$, Proposition 2.2 breaks
down. This occurs if y is a valid stochastic discount factor, in which case
the solutions to the population conjugate problems are
$\alpha = \tilde{\alpha} = 0$. As a consequence, $\phi_t(\alpha)$ and
$\tilde{\phi}_t(\tilde{\alpha})$ are both identically zero, giving rise to a
degenerate limiting distribution for $\{\sqrt{T}(d_T)^2\}$ and
$\{\sqrt{T}(\tilde{d}_T)^2\}$. Our results in Section 4 on the convergence of
the parameter estimators can be used to establish that the rate of
convergence of $\{(d_T)^2\}$ and $\{(\tilde{d}_T)^2\}$ is T, and is given by
a weighted … utions (see Vu … converge at the rate $\sqrt{T}$, although the
limiting distribution is not normal.

2.C  Consistent  Estimation  of  the  Arbitrage  Bounds 

As we discussed in Section 1, the second moment bounds can be converted
into standard deviation bounds if the mean of $m$ is known or if it can be
estimated using the price of a risk-free asset. When $Em$ is not known it must
be prespecified. Let $v$ be the hypothesized mean of $m$ when a risk-free asset
is not available. Proposition 2.1 can be applied to establish the
consistency of the second moment bound estimators for each admissible price
assignment $v$. In the case of $\tilde{\delta}$, for the price assignment to be admissible,
it must not induce arbitrage opportunities onto the augmented collection of
asset payoffs and prices. Any price (mean) assignment in the open interval
$(\ell_o, \upsilon_o)$ is admissible in this sense.

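The final question we explore in this section is whether the arbitrage
bounds, $\ell_o$ and $\upsilon_o$ given in (15) and (16), can be consistently estimated using
the sample analogs:

$$\ell_T = -\inf\Big\{\alpha' \frac{1}{T}\sum_{t=1}^{T} q_t : \alpha \in C \text{ and } \alpha' x_t \geq -1 \text{ for all } t = 1, 2, \ldots, T\Big\} \tag{27}$$

and

$$\upsilon_T = \inf\Big\{\alpha' \frac{1}{T}\sum_{t=1}^{T} q_t : \alpha \in C \text{ and } \alpha' x_t \geq 1 \text{ for all } t = 1, 2, \ldots, T\Big\}. \tag{28}$$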

The estimated upper arbitrage bound $\upsilon_T$ is always finite when there is a
payoff on a limited liability security that is never observed to be zero in
the sample. Our estimated range of the admissible values for the (average)
price of a unit payoff and hence mean of $m$ is $[\ell_T, \upsilon_T]$. Notice that these
bounds  can  be  computed  by  solving  simple  linear  programming  problems.  In 
Appendix  A  we  prove: 


almost surely. If $\upsilon_o$ is finite, then $\{\upsilon_T\}$ converges to $\upsilon_o$ almost surely; and
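To make the sample arbitrage bounds concrete, consider the simplest special case, stated here purely as an illustration: a single asset with strictly positive observed payoffs and a positive average price. In that case the two linear programs defining the sample bounds can be solved by inspection, so no linear programming routine is needed. The function name and arguments below are hypothetical, and the closed forms hold only under the stated single-asset assumptions.

```python
def arbitrage_bounds_single_asset(payoffs, prices, short_sales_allowed=True):
    """Sample lower and upper arbitrage bounds on the price of a unit payoff.

    Assumes one asset, all observed payoffs strictly positive, and a
    positive average price (assumptions of this sketch only).
    """
    if min(payoffs) <= 0:
        raise ValueError("payoffs must be strictly positive in this sketch")
    q_bar = sum(prices) / len(prices)
    if q_bar <= 0:
        raise ValueError("average price must be positive in this sketch")
    # Upper bound: cheapest position a >= 0 with a * x_t >= 1 for all t,
    # which requires a >= 1 / min(x_t).
    upper = q_bar / min(payoffs)
    if short_sales_allowed:
        # Lower bound: a * x_t >= -1 for all t requires a >= -1 / max(x_t),
        # so minus the infimum of a * q_bar equals q_bar / max(x_t).
        lower = q_bar / max(payoffs)
    else:
        # With a >= 0 the constraint never binds and the infimum is zero.
        lower = 0.0
    return lower, upper
```

When the asset is a gross return with price one, the interval is simply the reciprocal of the largest observed return to the reciprocal of the smallest; prohibiting short sales lowers the lower bound to zero in this one-asset case.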


2.D   Consistent  Estimation  of  the  Feasible  Region  of  Means  and  Standard 
Deviations 

We  now  consider  consistent  estimation  of  the  set  of  feasible  means  and 
standard  deviations  of  stochastic  discount  factors.  Previously  we  showed 
that  for  a  given  mean  of  the  stochastic  discount  factor,  the  standard 
deviation  bound  can  be  consistently  estimated.  However,  the  mean  of  the 
stochastic  discount  factor  typically  is  not  known.  As  a  result  it  is 
important  to  understand  the  sense  in  which  the  entire  feasible  region  can  be 
approximated.  Such  a  region  can  be  computed  with  or  without  imposing  the 
no-arbitrage  restriction  that  the  stochastic  discount  factors  be  positive. 
Let $S_o$ denote the feasible region without positivity and $S_o^*$ the (closure) of
the feasible region with positivity. Similarly, let $S_T$ and $S_T^*$ denote the
sample counterparts. The question we now turn to is in what sense are $S_T$ and
$S_T^*$ good approximations to $S_o$ and $S_o^*$?

When there is a unit payoff, all four feasible regions are vertical rays
in mean and standard deviation space because the (average) price of this
payoff is the mean discount factor. In this case the points of origin of the
rays $S_o$ and $S_o^*$ can be estimated consistently by the points of origin of the
corresponding rays $S_T$ and $S_T^*$.

In  the  more  usual  case  when  data  on  the  price  of  a  unit  payoff  is  not 
available,  matters  are  a  little  more  complicated.  The  feasible  regions  are 
no  longer  vertical  rays  but  instead  are  unions  of  such  rays  resulting  in 
convex  sets  with  nonempty  interiors.  The  boundaries  of  these  sets  can  be 
represented  as  (possibly  extended)  real-valued  functions  of  the  ordinate 
(hypothetical  mean),  and  our  previous  analysis  implies  pointwise  (in  the 
mean)  convergence  of  the  sample  analog  functions  to  their  population 
counterparts.  This  result  implies  uniform  convergence  of  the  sample  analog 
functions in the following sense.

Since  the  lower  and  upper  arbitrage  bounds  can  be  consistently 
estimated,  for  large  enough  T,  the  sample  analog  functions  under  positivity 
are finite on any compact subset of $(\ell_o, \upsilon_o)$. When positivity is ignored the
functions  are  finite  on  any  compact  subset  of  R.  Further  these  functions  are 
convex  functions  of  the  hypothetical  mean  of  the  discount  factor.  As  a 
result  [see  Theorem  10.8  of  Rockafellar  (1970)]  the  sample  analog  functions 
converge uniformly, almost surely, on any compact subset of $(\ell_o, \upsilon_o)$ in the
case  of  positivity  and  on  any  compact  set  when  positivity  is  ignored.  One 
difficulty  is  that  the  approximations  deteriorate  as  the  mean  assignment,  v, 
approaches  the  arbitrage  bounds  in  the  case  of  positivity,  or  when  v  gets 
large  when  positivity  is  ignored. 


arbitrage  bound  turns  out  not  to  be  problematic.  To  see  this,  instead  of 
viewing  the  boundaries  of  the  feasible  regions  as  functions  of  the  ordinate, 
we explore the approximation error from a set-theoretic vantage point in $\mathbb{R}^2$.
Consider first the case in which $\upsilon_o < +\infty$. Associated with a sample of size $T$
is  an  approximation  error  as  measured  by  the  Hausdorff  metric: 


$$\gamma_T = \max\{\eta(S_T^*, S_o^*),\ \eta(S_o^*, S_T^*)\} \tag{29}$$

where:

$$\eta(K_1, K_2) = \sup_{(v_1, w_1) \in K_1}\ \inf_{(v_2, w_2) \in K_2} |(v_1, w_1) - (v_2, w_2)|. \tag{30}$$

Measuring the approximation error via the Hausdorff metric allows
ordered  pairs  to  get  close  without  restricting  them  to  have  the  same 
ordinate.  In  other  words,  we  no  longer  confine  our  attention  to  "vertical" 
measures  of  distance,  as  is  the  case  when  we  view  the  boundaries  of  the 
feasible  regions  as  functions  of  the  hypothetical  (expected)  prices  of  a  unit 
payoff.  The  added  flexibility  in  the  Hausdorff  metric  permits  us  to  exploit 
better  the  consistent  estimation  of  the  upper  and  lower  arbitrage  bounds 
(Proposition  2.3) . 
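The following Python sketch illustrates the Hausdorff construction on finite sets of (mean, standard deviation) pairs. It is only a toy approximation under the assumption that each region is represented by a finite list of points; the function names are hypothetical and the actual feasible regions are of course not finite sets.

```python
import math


def directed_distance(K1, K2):
    """sup over points of K1 of the distance to the nearest point of K2."""
    return max(
        min(math.hypot(p[0] - q[0], p[1] - q[1]) for q in K2)
        for p in K1
    )


def hausdorff_distance(K1, K2):
    """Hausdorff distance: the larger of the two directed distances."""
    return max(directed_distance(K1, K2), directed_distance(K2, K1))
```

Because the nearest point in the other set need not share the same first coordinate, two boundaries that are far apart "vertically" near an arbitrage bound can still be close in this metric.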

When $\upsilon_o$ is infinite, the approximation error $\gamma_T$ defined by (29) will be
infinite. As a remedy, we replace $\eta$ by

$$\eta_\rho(K_1, K_2) = \sup_{\substack{(v_1, w_1) \in K_1 \\ 0 \leq v_1 \leq \rho}}\ \inf_{(v_2, w_2) \in K_2} |(v_1, w_1) - (v_2, w_2)| \tag{31}$$

where $\rho$ is any arbitrary positive number greater than the lower arbitrage


Proposition 2.4. Under Assumptions 1.1-1.3 and 2.1, $\{\gamma_T\}$ converges to zero
almost surely.

This proposition follows since the arbitrage bounds can be consistently
estimated (Proposition 2.3) and the lower boundaries of $\{S_T^*\}$ approach the
lower boundary of $S_o^*$ uniformly on any compact interval within the arbitrage
bound. This latter convergence follows from Proposition 2.1 and the
convexity of the boundary.

3.   Applications 

In  this  section  we  discuss  two  applications  of  the  analysis  of  Section 
2.  First  we  show  how  the  feasible  regions  for  the  means  and 
standard-deviations  can  be  used  to  test  a  specific  model  of  the  discount 
factor.  Burnside  (1994)  and  Cecchetti,  Lam  and  Mark  (1993)  have  developed  a 
version  of  this  test  when  there  are  no  assets  subject  to  short-sale 
constraints  or  transactions  costs.  We  demonstrate  how  this  test  can  be 
implemented  in  a  relatively  simple  manner  by  exploiting  the  results  of 
Section  2.  Further  we  formulate  the  test  so  that  it  is  also  applicable  when 
there  are  assets  subject  to  short-sale  constraints.  As  a  result  this 
provides  (large-sample)  statistical  foundation  to  the  tests  of  asset  pricing 
models  suggested  by  He  and  Modest  (1993),  and  Luttmer  (1994). 

Second  we  outline  an  extension  of  the  specification-error  bound  analysis 
that  is  useful  when  the  discount  factor  proxy  under  consideration  depends 
upon  a  vector  of  unknown  parameters.  We  consider  how  these  results  can  be 
used to select between two nonnested models by comparing the minimized values
of the specification-errors.

3.A Testing a Specific Model of the Discount Factor using Volatility
Bounds 

Suppose that in addition to asset-market data, a model of the discount
factor is posited and a time series of observations of the discount factor is
available: $\{m_t : t = 1, \ldots, T\}$. One way to test the model is to examine
whether it satisfies the volatility bounds discussed in Sections 1 and 2.
Since observations of the discount factor are available, the average price of
a unit payoff can be estimated by the mean of $m$. Specifically, form $x$ by
augmenting the original vector of payoffs with a unit payoff; form $q$ by
augmenting the original vector of prices with the random variable $m$; and form
$C$ by constructing the Cartesian product of the original cone with $\mathbb{R}$. In
effect, we have added a unit payoff with an average price $m$ that is not
subject to a short-sale constraint. In forming a test, we can apply the
results of Sections 1 and 2.B with one minor modification. The random
functions $\phi$ and $\tilde{\phi}$ are now constructed by setting the proxy $y$ to zero and
subtracting $m^2$:


$$\phi(\alpha) = -(-\alpha'x)^2 - 2\alpha'q - m^2, \tag{32}$$

and

$$\tilde{\phi}(\alpha) = -[(-\alpha'x)^+]^2 - 2\alpha'q - m^2. \tag{33}$$


Subtracting $m^2$ does not alter the solutions to either the sample or
population maximization problems. It does, however, change the maximized
values of the criteria functions. The volatility bounds for $Em^2$ will be
satisfied if, and only if,

$$\epsilon = \max_{\alpha \in C} E\phi(\alpha) \leq 0, \tag{34}$$

when positivity is ignored, or

$$\tilde{\epsilon} = \max_{\alpha \in C} E\tilde{\phi}(\alpha) \leq 0, \tag{35}$$

when positivity is imposed. The limiting distribution reported in
Proposition 2.2 (appropriately modified) can be applied to construct a test
of these hypotheses using sample analog estimators of $\epsilon$ and $\tilde{\epsilon}$. Again, we
have formulated the problem so that approximation error due to parameter
estimation plays no role in the limiting distributions for these sample
analogs.

In practice we find the solutions for the sample maximization problems,
estimate the asymptotic standard errors, and form one-sided tests. In
particular, let $\tilde{c}_T$ be the maximized value of $\frac{1}{T}\sum_{t=1}^{T}\tilde{\phi}_t(\alpha)$ over the
constraint set $C$. Then $\sqrt{T}[\tilde{c}_T - \tilde{\epsilon}]$ converges in distribution to a normal
random variable with mean zero and variance $u'\tilde{V}u$. This variance can be
estimated in the manner described in Section 2.B. Since $\tilde{\epsilon}$ is not specified
under the null hypothesis (35), the "conservative" choice of $\tilde{\epsilon} = 0$ is used in
constructing the test statistic.7
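The one-sided test just described can be sketched in a few lines of Python. This is an illustrative sketch only: the function and argument names are hypothetical, `c_T` stands for the maximized sample criterion, and `long_run_variance` stands for a consistent estimate of the asymptotic variance (for instance from a frequency zero spectral density estimator). The null value is set to the conservative choice of zero.

```python
import math


def one_sided_bound_test(c_T, long_run_variance, T):
    """One-sided test of the null that the population criterion is <= 0.

    Returns the standardized statistic and the upper-tail p-value from the
    standard normal approximation.
    """
    standard_error = math.sqrt(long_run_variance / T)
    z = c_T / standard_error
    # Upper-tail normal probability: P(Z > z) = 0.5 * erfc(z / sqrt(2)).
    p_value = 0.5 * math.erfc(z / math.sqrt(2.0))
    return z, p_value
```

A small p-value indicates that the candidate discount factor violates the volatility bound by more than can be attributed to sampling error; large negative values of `c_T` give p-values near one.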

Similar approaches to testing a model of the discount factor can be
applied when a time series for $m$ can be constructed from simulated data
instead of actual data. In this case the randomness of $\phi(\alpha)$ can be
decomposed additively into two components, one due to the randomness of the
security market payoffs and prices and the other due to the simulation of $m$.
As in the work of McFadden (1989), Pakes and Pollard (1989), Lee and Ingram
(1991) and Duffie and Singleton (1993), the asymptotic variance in the
limiting distribution will now have an extra component due to the sampling
error induced by simulation. For an example of this approach and a more
extensive discussion see Heaton (1993).

When the first two moments of $m$ can be computed numerically with an
arbitrarily high degree of accuracy, we can proceed as follows. Augment the
price vector with $Em$ instead of $m$ and subtract $Em^2$ from the criteria instead
of $m^2$ as in (32) and (33). This same strategy can be employed to assess the
accuracy of the estimated feasible region for means and standard deviations
of stochastic discount factors. For any hypothetical mean-standard deviation
pair for $m$, one can compute the corresponding test statistic and probability
value.

3.B  Minimizing  the  Specification-Error  Bound  for  Parameterized  Families  of 
Models 

Recall  that  the  specification-error  bounds  provide  a  way  to  assess  the 
usefulness  of  an  asset  pricing  model  even  when  it  is  technically 
misspecified. In many situations the discount factor proxy depends on
unknown  parameters.  For  example,  in  a  representative  consumer  model  with 
constant  relative  risk  aversion  preferences,  the  pure  rate  of  time  preference 
and  the  coefficient  of  relative  risk  aversion  are  typically  unknown.  In  this 
case  one  way  to  estimate  the  parameters  of  the  model  is  to  minimize  the 
specification  error.  Alternatively,  in  an  observable  factor  model,  the 
discount  factor  proxy  depends  on  a  linear  combination  of  the  factors  with 
unknown  coefficients.  As  in  the  work  of  Shanken  (1987)  one  could  imagine 
selecting  factor  coefficients  to  minimize  the  specification  error.  We  now 
sketch  how  the  results  of  Section  2.B  extend  in  a  straightforward  manner  to 
obtain a distribution theory for the minimized value of the
specification-error  bound.  In  Section  4  we  discuss  distribution  theory  for 
the  resulting  estimators  of  the  parameters  of  the  discount  factor. 

Suppose that the discount factor proxy $y$ depends on the parameter vector
$\beta \in B$ where $B$ is a compact set. The population optimization problems of
interest are now:

$$\delta^2 = \min_{\beta \in B}\ \max_{\alpha \in C} \Big\{ E y(\beta)^2 - E\{[y(\beta) - x'\alpha]^2\} - 2\alpha' Eq \Big\}, \tag{36}$$

$$\tilde{\delta}^2 = \min_{\beta \in B}\ \max_{\alpha \in C} \Big\{ E y(\beta)^2 - E\{[(y(\beta) - x'\alpha)^+]^2\} - 2\alpha' Eq \Big\}. \tag{37}$$

When $\delta$ and $\tilde{\delta}$ are strictly positive and the parameterized family of stochastic
discount factors satisfies the appropriate smoothness and moment
restrictions, an extended version of Proposition 2.2 can be obtained for the
sample analog estimators of $\delta$ and $\tilde{\delta}$. Again the limiting distribution will be
the same as if the solutions to the population optimization problems were
known a priori.
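As a rough illustration of the min-max problem without positivity, consider the special case of a single payoff and no short-sale constraint. Maximizing the quadratic criterion over a scalar multiplier then gives the squared bound in closed form as the squared average pricing error divided by the second moment of the payoff, and the outer minimization can be approximated by a grid search. The sketch below replaces expectations by sample averages; the function names, the `proxy` callable, and the grid are hypothetical choices for this example.

```python
def spec_error_bound_sq(x, q, y):
    """Squared specification-error bound for one payoff, no positivity.

    x, q, y are equal-length lists of payoffs, prices and proxy values.
    Maximizing over a scalar multiplier gives (mean(x*y) - mean(q))**2
    divided by mean(x**2).
    """
    T = len(x)
    mean_xy = sum(xi * yi for xi, yi in zip(x, y)) / T
    mean_q = sum(q) / T
    mean_xx = sum(xi * xi for xi in x) / T
    return (mean_xy - mean_q) ** 2 / mean_xx


def minimize_over_grid(x, q, proxy, grid):
    """Grid search for the parameter value with the smallest bound.

    `proxy(beta)` must return the list of proxy values for parameter beta.
    """
    best = min(grid, key=lambda b: spec_error_bound_sq(x, q, proxy(b)))
    return best, spec_error_bound_sq(x, q, proxy(best))
```

For instance, with a proxy that scales a fixed factor series by `beta`, the search returns the grid point whose implied average pricing error is closest to zero. With several payoffs or with short-sale constraints the inner maximization no longer has this scalar closed form.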

The  approach  can  be  extended  to  compare  the  smallest 
specification-errors  for  two  nonnested  families  of  models.  Such  a  comparison 
potentially  can  be  used  as  a  device  for  selecting  between  the  two  families  of 
models.  Vuong  (1989)  examined  a  very  similar  problem  by  using  the 
large-sample  behavior  of  likelihood  ratios  for  two  nonnested  families  of 
misspecified  models  (in  particular,  see  the  discussion  in  Section  5  of  Vuong 
1989);  and  we  can  imitate  and  adapt  his  analysis  to  our  problem.    More 
with two such families. Take the null hypothesis to be:


(38) 


Under  the  null  hypothesis,  the  smallest  specification-error  associated  with 
each  parameterized  family  is  the  same.  As  a  consequence  the  performance  of 
the two parameterized families cannot be ranked once sampling error is
accounted  for.  This  hypothesis  can  be  tested  by  using  the  corresponding 
distribution  theory  for  the  difference  between  the  analog  estimators  of  S 


4.   Asymptotic  Distribution  of  the  Multipliers 

Recall  that  the  asymptotic  distribution  f 
discussed  in  Sections  2  and  3  depends  only  on  the  population  solutions  to  the 
conjugate  problems  (8)  and  (9).  In  other  words,  sampling  error  in  the 
estimated  multipliers  does  not  contribute  to  the  limiting  distribution  of  the 
specification  error  bounds.  However,  as  elsewhere  in  econometric  practice, 
the magnitude of the multipliers remains interesting in its own right as a
measure  of  the  importance  of  components  of  the  pricing  relations  to  the 
bounds.  Consequently,  it  is  advantageous  to  be  able  to  make  statistical 
inferences  about  their  magnitude. 

To  amplify  this  point,  one  use  of  the  asymptotic  distribution  for  the 
multiplier  estimators  is  to  test  whether  the  specification-error  or 
volatility  bounds  remain  the  same  when  a  subset  of  assets  is  omitted  from  the 
analysis.  A  special  case  of  such  a  test  is  a  region  subset  test  where  the 
question  of  interest  is  whether  given  an  initial  set  of  asset  returns, 
additional  asset  returns  result  in  an  increase  in  the  volatility  bounds.  It 
is problematic to construct a test directly in terms of the difference
between a measured bound computed with the full set of securities and the
corresponding  bound  calculated  with  the  more  limited  array  of  securities 
because  the  resulting  statistic  has  a  degenerate  distribution  under  the  null 
hypothesis.  This  degeneracy  follows  from  the  fact  that  sampling  error  in  the 
multipliers  does  not  contribute  to  the  limiting  distribution  for  the  bounds 
and  is  circumvented  by  instead  basing  the  statistical  test  directly  on  the 
estimated  multipliers.  That  is,  it  is  advantageous  to  reformulate  the  null 
hypothesis  to  be: 

$$R\alpha = 0, \quad \text{or} \quad R\tilde{\alpha} = 0 \tag{39}$$

where  R  is  an  appropriately  constructed  selection  matrix  (see  below),  and  to 
use  the  limiting  distribution  for  the  estimated  multipliers  in  constructing 
an  asymptotic  chi-square  test. 
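For the simplest case of a single zero restriction on one multiplier, the asymptotic chi-square test can be sketched as follows. This is an illustration under stated assumptions, not a general implementation: `a_hat` is the estimated multiplier being tested, `avar` is a consistent estimate of the asymptotic variance of the scaled estimator, and both names are hypothetical. With one restriction the statistic has a chi-square distribution with one degree of freedom, whose upper tail can be written with the complementary error function.

```python
import math


def wald_test_single_zero(a_hat, avar, T):
    """Wald test of the null that a single multiplier equals zero.

    Returns the Wald statistic T * a_hat**2 / avar and its p-value from the
    chi-square distribution with one degree of freedom.
    """
    wald = T * a_hat ** 2 / avar
    # For one degree of freedom, P(chi2 > w) = erfc(sqrt(w / 2)).
    p_value = math.erfc(math.sqrt(wald / 2.0))
    return wald, p_value
```

Testing several restrictions jointly requires the full covariance matrix of the selected multipliers and a chi-square distribution with as many degrees of freedom as restrictions.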

Several versions of region subset tests have been used in the
literature.8 For example, Snow (1991) considered the small firm effect by
examining whether the returns on small capitalization stocks have incremental
importance in determining the volatility bounds over and above the returns of
large capitalization stocks. Other examples can be found in Braun (1991),
Cochrane and Hansen (1992), De Santis (1993) and Knez (1993).

The limiting distribution for the multipliers shows up in other
applications as well. For instance, testing (39) in conjunction with
specification-error analysis is helpful in ascertaining which assets are
important contributors to model misspecification. Also, when a researcher
uses the specification-error bounds to select among a parameterized family of
discount factor proxies, it is desirable to make inferences about the
parameter vector chosen. As we will see in this section, the asymptotic
distribution for the parameter estimator selected in this fashion interacts
with the limiting distribution for the estimated multipliers. As a
consequence, our characterization of the multiplier limit distribution is an
important component in the derivation of the limiting distribution for
estimators of parametric stochastic discount factor proxies.

The remainder of this section is organized as follows. In Section 4.A
the  distribution  theory  for  the  estimated  multipliers  is  developed  assuming 
that  there  are  no  assets  subject  to  short-sale  constraints.  In  Section  4.B 
we  comment  briefly  on  how  the  theory  can  be  extended  to  the  case  where 
short-sale  constraints  are  imposed  on  some  of  the  assets. 

4.A Distribution without Market Frictions

In the absence of short-sale constraints, the cone $C$ is $\mathbb{R}^n$. As a
consequence the estimation problem for the multipliers is posed as an
unconstrained maximization problem and the limiting covariance matrices for
the asymptotic distribution of the coefficient estimators have a form that is
familiar both from M estimation [for example, see Huber (1981)] and from GMM
estimation [for example, see Hansen (1982)]. The only complication occurs
when  considering  the  bounds  in  which  the  no-arbitrage  restriction  is  fully 
exploited  because  of  the  kink  induced  by  the  nonnegativity  restriction. 

The moment conditions of interest are given by the first order
conditions (10) and (11) for the conjugate maximum problems:

$$E[x(y - x'\alpha) - q] = 0, \tag{40}$$

and

$$E[x(y - x'\tilde{\alpha})^+ - q] = 0, \tag{41}$$

with sample counterparts

$$\frac{1}{T}\sum_{t=1}^{T}[x_t(y_t - x_t'a_T) - q_t] = 0 \tag{42}$$

and

$$\frac{1}{T}\sum_{t=1}^{T}[x_t(y_t - x_t'\tilde{a}_T)^+ - q_t] = 0. \tag{43}$$

While the equations for $a_T$ are linear, those for $\tilde{a}_T$ are nonlinear. In the
latter case, we use a linear approximation to the moment conditions in
deriving the central limit approximation for the parameters:

$$\begin{aligned} x(y - x'a)^+ - q &\approx x(y - x'\tilde{\alpha})^+ - q - xx'1_{\{y - x'\tilde{\alpha} \geq 0\}}(a - \tilde{\alpha}) \\ &= x(y - x'a)1_{\{y - x'\tilde{\alpha} \geq 0\}} - q. \end{aligned} \tag{44}$$

Notice that the function of $a$ on the left side of (44) is differentiable
except at values of $a$ such that $y - x'a = 0$. We assume that such sample points
are "unusual":

Assumption 4.1: $\Pr\{y - x'\tilde{\alpha} = 0\} = 0$.


As is shown in Appendix C, Assumptions 1.1 and 4.1 are sufficient for us to
study the asymptotic behavior of the estimator $\{\tilde{a}_T\}$ using the linearization
on the right side of (44).9

$$E[x(y - x'\tilde{\alpha})1_{\{y - x'\tilde{\alpha} \geq 0\}} - q] = 0. \tag{45}$$

To use linear equation system (45) to identify $\tilde{\alpha}$, the matrix
$E(xx'1_{\{y - x'\tilde{\alpha} \geq 0\}})$ must be nonsingular. Given Assumption 4.1, this rank
condition is equivalent to Assumption 1.4 because x and x must coincide when
no short-sale constraints are imposed. The counterpart to this rank
condition for $\alpha$ is that the second moment matrix $E(xx')$ be nonsingular as
required by Assumption 1.3.

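Working with the two linear moment conditions, we obtain the
approximations:

$$\begin{aligned} \sqrt{T}(\tilde{a}_T - \tilde{\alpha}) &\approx [E(xx'1_{\{y - x'\tilde{\alpha} \geq 0\}})]^{-1}\sqrt{T}\,\frac{1}{T}\sum_{t=1}^{T}[x_t(y_t - x_t'\tilde{\alpha})^+ - q_t] \\ \sqrt{T}(a_T - \alpha) &\approx [E(xx')]^{-1}\sqrt{T}\,\frac{1}{T}\sum_{t=1}^{T}[x_t(y_t - x_t'\alpha) - q_t] \end{aligned} \tag{46}$$

where the notation $\approx$ is used to denote the fact that the differences between
the left and right sides of (46) converge in probability to zero. Let $w \equiv
[0 \;\; I]$. Combining approximations (46) with Assumptions 2.2 and 2.3 gives us
the asymptotic distribution of the analog estimators.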

Proposition 4.1: Suppose Assumptions 1.1-1.3, 2.1 and 2.2 are satisfied.
Then $\{\sqrt{T}(a_T - \alpha)\}$ converges in distribution to a normally distributed random


Suppose Assumptions 1.1-1.4, 2.1, 2.3 and 4.1 are satisfied. Then $\{\sqrt{T}(\tilde{a}_T - \tilde{\alpha})\}$
converges in distribution to a normally distributed random vector with mean
zero and covariance matrix:

$$[E(xx'1_{\{y - x'\tilde{\alpha} \geq 0\}})]^{-1}\, w\tilde{V}w'\, [E(xx'1_{\{y - x'\tilde{\alpha} \geq 0\}})]^{-1}.$$


To apply these limiting distributions in practice requires consistent
estimators of the asymptotic covariance matrices. The terms $wVw'$ and $w\tilde{V}w'$
can be estimated using one of the spectral methods referenced previously.
Under assumptions maintained in Proposition 4.1, the matrices $E(xx')$ and
$E(xx'1_{\{y - x'\tilde{\alpha} \geq 0\}})$ can be estimated consistently by their sample analogs, where
the estimator $\tilde{a}_T$ is used in place of $\tilde{\alpha}$ in estimating the second of these
matrices.
We now briefly consider two extensions of Proposition 4.1:
(i) Estimation of the parameters of a discount factor proxy. Suppose
that the discount factor proxy depends on the parameter vector $\beta$ and let $\tilde{\beta}$ be
the parameter vector that minimizes the specification error when positivity
is imposed. Assume that the parameterized family satisfies the appropriate
smoothness and moment restrictions, and that $\tilde{\beta}$ is in the interior of the
parameter space. The population moment conditions are given by:

$$E\{x[y(\tilde{\beta}) - x'\tilde{\alpha}]^+ - q\} = 0; \tag{47}$$

and

$$E\Big[\frac{\partial}{\partial\beta}\Big\{y(\tilde{\beta})^2 - \big([y(\tilde{\beta}) - \tilde{\alpha}'x]^+\big)^2\Big\}\Big] = 0. \tag{48}$$

The distribution theory for the analog estimators $\{\tilde{b}_T\}$ and $\{\tilde{a}_T\}$ of $\tilde{\beta}$ and $\tilde{\alpha}$
respectively can be deduced by taking linear approximations to the sample
moment conditions (47) and (48) and appealing to the results of Appendix C.

(ii) Region subset tests. Let $z$ denote an $(n-1)$-dimensional vector of
assets under consideration with price vector $s$, and let $f$ be the
$k$-dimensional subvector of $z$ including the $k-1$ asset payoffs that are to be
used to construct the bound augmented by a unit payoff. Formally, the region
subset test can be represented as the hypothesis:

$$E[z(f'\theta)^+ - s] = 0, \qquad E[(f'\theta)^+ - v] = 0. \tag{49}$$


vector of securities $z$ correctly. One possibility is to test this hypothesis
for a prespecified $v$, and the other is to test whether it is satisfied for


our previous setup, form the $n$-dimensional vector $x$ by augmenting $z$ with a
unit payoff and form $q$ by augmenting $s$ with the "price" $v$. Then hypothesis
(49) can be interpreted as a zero restriction on the coefficient vector $\tilde{\alpha}$
employed in Sections 1, 2 and 4.A. The components of coefficient vector $\tilde{\alpha}$
set to zero correspond to the entries of $z$ that are omitted from $f$. A large
sample Wald test of this zero restriction can be formed by applying the
limiting distribution in Proposition 4.1.

Alternatively, we could estimate the parameter vector $\theta$ based on the
overidentified system (49) using GMM. The analysis leading up to Proposition
4.1  can  be  easily  modified  to  show  that  the  minimized  value  of  the  criterion 
function  is  distributed  as  a  chi-square  random  variable  with  n  -  k  degrees  of 
freedom  [see  Hansen  (1982)].   When  the  hypothesis  of  interest  is  altered  to 


freedom. 

4.B  Distribution  with  Market  Frictions 

We  now  briefly  describe  how  the  distribution  theory  is  modified  when 
some short-sale constraints are imposed ($C$ is a proper subset of $\mathbb{R}^n$). We
will focus on the limiting behavior of $\sqrt{T}(\tilde{a}_T - \tilde{\alpha})$, but the results for
$\sqrt{T}(a_T - \alpha)$ are very similar. As in Section 1, we partition $x$ according to
whether $m$ prices the payoffs with equality, that is by whether

$$Emx = Eq, \quad \text{or} \quad Emx < Eq. \tag{50}$$


strict  inequality  will  equal  zero  with  arbitrarily  high  probability  as  the 
sample  size  gets  large.  Hence  the  limiting  distribution  is  degenerate  for 
these  component  estimators. 

Consider next the estimator of the remaining subvector of $\tilde{\alpha}$, which we
denote $\gamma$. Because of the degeneracy just described, we can, in effect, treat
the limiting distribution of the estimator of $\gamma$ separately. Let $C$ be the
lower-dimensional cone associated with estimating $\gamma$. If $\gamma$ is an interior
point of $C$, then the argument leading up to Proposition 4.1 can be imitated
to deduce a limiting normal distribution for the parameter estimator.
However, if $\gamma$ is at the boundary of the cone $C$, the limiting distribution may
be a nonlinear function of a normally distributed random vector [see
Haberman (1989), page 1645].10

5.   Conclusions  and  Extensions 

In  this  paper  we  provided  statistical  methods  for  assessing 
asset-pricing  models  based  on  specification-error  and  volatility  bounds.  Two 
significant  advantages  of  the  statistical  methods  are  that  they  are  easy  to 
interpret  and  that  they  are  simple  to  implement  even  in  the  presence  of 
transactions  costs  and  short-sales  constraints.  The  results  can  be  used  in  a 
variety  of  ways.  For  example,  they  can  be  used  to  test  specific  models  of 
the  discount  factor,  to  examine  the  information  contained  in  different  sets 
of  asset-market  data,  and  to  assess  misspecified  asset-pricing  models. 

There  are  several  interesting  extensions  of  the  econometric  methods  in 
this  paper  including  the  following: 

(i)  The  short-sale  constraint  formulation  could  be  generalized  to  include 
"solvency  constraints"  whereby  portfolios  are  restricted  so  that  portfolio 
payoffs are nonnegative. This amounts to imposing a form of borrowing
constraint  on  consumers  [see,  for  example,  Hindy  (1993)  and  Luttmer  (1994)]. 
As  in  Section  1,  the  constraints  on  portfolio  weights  in  the  presence  of 
solvency  constraints  can  be  formulated  as  a  closed  convex  cone.  However, 
without  knowledge  of  the  distribution  of  the  payoffs  on  primitive  securities 
the  cone  would  be  subject  to  estimation  error  and  hence  not  directly  covered 
by  the  results  in  this  paper.  The  consistency  proof  in  Section  2.C  for  the 
arbitrage  bounds  might  well  be  adaptable  to  approximating  constraint  sets 
more  generally.  It  is  then  of  interest  to  understand  how  approximation 
errors  for  the  constraint  sets  impact  on  the  distribution  of  the 
specification  and/or  volatility  bounds. 

(ii)  A  difficult  feature  of  the  limiting  distribution  theory  for  the 
Lagrange multipliers in the presence of short-sales constraints (Section 4.B)
is  the  manner  in  which  it  depends  on  the  true  parameter  vector  and  the 
associated  discontinuities.  This  feature  makes  the  distribution  theory 
harder  to  use  in  practice  and,  in  other  settings,  has  led  researchers  to 
compute  approximate  bounds  on  probabilities  of  test  statistics  [see,  for 
example  Wolak  (1991)  and  Boudoukh,  Richardson  and  Smith  (1992)].  Perhaps 
similar  probability  bounds  could  be  derived  for  the  region  subset  tests  with 
frictions. 

(iii)  Using  tools  similar  to  those  described  in  Section  1,  Chen  and  Knez 
(1995)  have  developed  nonparametric  measures  of  market  integration.  It 
should  be  possible  to  extend  the  econometric  methods  in  this  paper  to 
incorporate  transactions  costs  in  the  market  integration  measures  they 
propose. 
Appendix A: Consistency

In this appendix we demonstrate formally the results of Section 2.A, and
2.D. We maintain Assumptions 1.1-1.3, and 2.1 throughout.

Let $\mathcal{U}$ denote a compact set in $\mathbb{R}^n$. For any subset $h$ of $\mathcal{U}$, we let cl($h$)
denote the closure of $h$. Let $\mathcal{K}$ denote the collection of all nonempty closed
subsets of $\mathcal{U}$. We use the Hausdorff metric $\eta$ on $\mathcal{K}$ given by

$$\eta(h_1, h_2) = \max\Big\{ \sup_{\alpha_1 \in h_1}\ \inf_{\alpha_2 \in h_2} |\alpha_1 - \alpha_2|,\ \sup_{\alpha_2 \in h_2}\ \inf_{\alpha_1 \in h_1} |\alpha_1 - \alpha_2| \Big\} \tag{A.1}$$

to define notions of convergence of compact sets. For some of our results we
will  use  the  construct  of  a  lim  sup  of  a  sequence  in  K.  We  follow 
Hildenbrand  (1974)  and  define: 


Definition A.1: For a sequence $\{h_j\}$ in $\mathcal{K}$, $\limsup_j h_j = \bigcap_{J \ge 1} \mathrm{cl}\big( \bigcup_{j \ge J} h_j \big)$.


Since the lim sup is the intersection of a decreasing sequence of closed
sets, it is closed and not empty. An alternative way to characterize the lim
sup is to imagine forming sequences of points by selecting a point from each
$h_j$. All of the limit points of convergent subsequences are in the lim sup,
and, in fact, all of the elements of the lim sup can be represented in this
manner.
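
The Hausdorff metric in (A.1) can be made concrete for finite point sets. The following is a minimal sketch, not part of the original argument: the function name `hausdorff` and the example sets are illustrative assumptions, and finite sets stand in for the closed sets in the text.

```python
import math

def hausdorff(h1, h2):
    """Hausdorff distance between two finite, nonempty sets of points in R^n."""
    def directed(a_set, b_set):
        # sup over a in a_set of the distance from a to the nearest point of b_set
        return max(min(math.dist(a, b) for b in b_set) for a in a_set)
    return max(directed(h1, h2), directed(h2, h1))

# Illustrative sets: h_j = {0, 1/j, 1} approaches h_inf = {0, 1} as j grows.
h_inf = [(0.0,), (1.0,)]
for j in (2, 10, 100):
    h_j = [(0.0,), (1.0 / j,), (1.0,)]
    print(j, hausdorff(h_j, h_inf))
```

In this example the distance equals 1/j for j of at least 2, so the sets converge in the Hausdorff sense and every limit point of selections from the $h_j$ lies in the limit set.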

We  shall  make  reference  to  an  implication  of  a  Corollary  on  page  30  of 
Hildenbrand  (1974)  that  characterizes  the  set  of  minimizers  of  an 
"approximating"  function  over  an  "approximating  set." 




Lemma A.1: Suppose

(i) $\{\psi_j\}$ is a sequence of continuous functions mapping $\mathcal{U}$ into $\mathbb{R}$ that
converges uniformly to $\psi_\infty$;

and

(ii) $\{h_j\}$ converges to $h_\infty$.

Then $\limsup_j g_j \subset g_\infty$, where $g_j = \{u \in h_j : \psi_j(u) \le \psi_j(u') \text{ for all } u' \in h_j\}$,
and

$$\lim_{j \to \infty} \min_{h_j} \psi_j = \min_{h_\infty} \psi_\infty .$$


Proof: To verify that this follows from the Corollary in Hildenbrand, let $\mathcal{J}$
denote the set of positive integers augmented by $+\infty$, and endow $\mathcal{J}$ with the
usual metric for a one-point compactification. Then in light of (i), the
sequence $\{\psi_j\}$ in conjunction with $\psi_\infty$ defines a continuous function on $\mathcal{J} \times \mathcal{U}$;
and in light of (ii), the sequence $\{h_j\}$ in conjunction with $h_\infty$ defines a
continuous compact correspondence mapping $\mathcal{J}$ into $\mathcal{U}$. The conclusion of
Lemma A.1 then follows from the Corollary together with part (ii) of
Proposition 1 of page 22 in Hildenbrand. Q.E.D.
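
A small numerical illustration of Lemma A.1 may help fix ideas. The sketch below is only an example under stated assumptions: the functions `make_psi` and `psi_inf`, the intervals $h_j = [1/j, 1]$ and the grid search are all hypothetical choices, not taken from the paper.

```python
def argmin_on_grid(psi, lo, hi, n=1000):
    """Return (minimizer, minimum) of psi over an (n+1)-point grid on [lo, hi]."""
    grid = [lo + (hi - lo) * k / n for k in range(n + 1)]
    u_star = min(grid, key=psi)
    return u_star, psi(u_star)

def psi_inf(u):
    # Limiting function, minimized over h_inf = [0, 1] at u = 0.
    return (u + 0.5) ** 2

def make_psi(j):
    # psi_j differs from psi_inf by at most 1/j on [0, 1]: uniform convergence.
    return lambda u: (u + 0.5) ** 2 + u / j

# h_j = [1/j, 1] converges to h_inf = [0, 1] in the Hausdorff metric.
for j in (1, 10, 100, 1000):
    u_j, min_j = argmin_on_grid(make_psi(j), 1.0 / j, 1.0)
    print(j, u_j, min_j)
print("limit:", argmin_on_grid(psi_inf, 0.0, 1.0))
```

Here the minimizers 1/j approach the minimizer 0 of the limiting problem, and the minimized values approach 0.25, which is what the lemma asserts for this particular sequence.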

Turning to the results in Sections 1 and 2.A, we first establish the
compactness of the sets of solutions to the conjugate problems:

Lemma A.2: The sets of solutions to conjugate maximization problems (8) and
(9) are compact.

Proof: We consider the case of problem (9). The proof for the case of
problem (8) is similar. The set of solutions is closed because the
constraint set is closed and the criterion function is continuous.
Boundedness of the set of solutions can be demonstrated by investigating the
tail properties of the criterion function. We consider two cases: directions
$\zeta$ for which $\zeta'x$ is negative with positive probability and directions $\zeta$ for
which $\zeta'x$ is nonnegative. To study the former case we take the criterion in
(9) and divide it by $1 + |\alpha|^2$. For large values of $|\alpha|$ the scaled criterion
is approximately:

$$-E[(-x'\zeta)^{+2}] \quad \text{where } \zeta = \alpha/[1 + |\alpha|^2]^{1/2} . \tag{A.2}$$

Hence $|\zeta|$ is approximately one for large values of $|\alpha|$. Moreover, $\zeta'x$ is a
payoff in $P$. Consequently, the unscaled criterion will decrease (to $-\infty$)
quadratically for large values of $|\alpha|$.

Consider next directions $\zeta$ for which $\zeta'x$ is nonnegative. From
Assumption 1.2 we have that $\zeta'Eq$ must be strictly positive unless $\zeta'x$ is
identically zero. When $\zeta'x = 0$, it follows from Assumptions 1.2 and 1.3 that
$\zeta'Eq$ is again strictly positive.

For directions $\zeta$ for which the payoff $\zeta'x$ is nonnegative, we study the
tail behavior of the criterion after dividing by $(1 + |\alpha|^2)^{1/2}$, which yields
approximately $-\zeta'Eq$ for large values of $|\alpha|$. Hence in these directions the
unscaled criterion must diminish (to $-\infty$) at least linearly in $|\alpha|$. Thus
in either case, we find that the set of solutions to conjugate problem (9) is
bounded. Q.E.D.
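
The two tail regimes in the proof of Lemma A.2 can be seen in a one-asset example. This is a hedged sketch: it assumes the criterion in (9) has the form $E[y^2 - ((y - x'\alpha)^+)^2] - 2\alpha'Eq$, which is not restated in this appendix, and the two-state distributions and prices below are made up for illustration.

```python
def criterion(alpha, states, q):
    """E[y^2 - ((y - x*alpha)^+)^2] - 2*alpha*q for equally likely (x, y) states.

    Assumed form of the criterion in problem (9); one asset, constant price q.
    """
    total = 0.0
    for x, y in states:
        gap = max(y - x * alpha, 0.0)
        total += y * y - gap * gap
    return total / len(states) - 2.0 * alpha * q

# Case 1: the payoff is negative with positive probability (x = -1 in one state).
mixed_sign = [(-1.0, 1.0), (2.0, 1.0)]
# Case 2: the payoff is nonnegative in every state.
nonnegative = [(0.5, 1.0), (2.0, 1.0)]

for alpha in (10.0, 100.0, 1000.0):
    quad = criterion(alpha, mixed_sign, 0.4) / alpha ** 2   # tends to -E[((-x)^+)^2] = -0.5
    lin = criterion(alpha, nonnegative, 0.9) / alpha        # tends to a negative multiple of q
    print(alpha, quad, lin)
```

In the first case the criterion falls quadratically in $\alpha$; in the second it falls only linearly, at a rate governed by the (strictly positive) price. Either way the maximizers cannot escape to infinity, which is the content of the boundedness argument.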


We  now  formally  establish  Proposition  2.1: 


Proof  of  Proposition  2.1: 
We treat only the case of $\{\tilde d_T\}$; the proof
for $\{d_T\}$ is very similar. Assumptions 1.1 and 2.1 imply that $\{\Sigma_T \tilde\phi(\alpha)\}$
converges almost surely to $E\tilde\phi(\alpha)$ for each $\alpha \in C$. Since for each $T$, $\Sigma_T \tilde\phi$ is
concave as is $E\tilde\phi$, Theorem 10.8 of Rockafellar implies that $\{\Sigma_T \tilde\phi\}$ converges
uniformly on any compact set in $\mathbb{R}^n$. Further from Lemma A.2, the set of
maximizers of $E\tilde\phi$ is bounded. For a positive number $N$, define $C_N \equiv \{\alpha \in C :
|\alpha| \le N\}$ and $D_N = \{\alpha \in C : |\alpha| = N\}$. Then $C_N$ and $D_N$ are compact. By
choosing $N$ to be sufficiently large we can ensure that $C_N$ contains all of the
maximizers of $E\tilde\phi$ over the constraint set $C$ and that none of the maximizers
are in $D_N$. Let $\delta_N$ be the maximized value of $E\tilde\phi$ over $D_N$. Then by choice of $N$
we have that $\delta_N < \tilde\delta$. Since $\{\Sigma_T \tilde\phi\}$ converges uniformly to $E\tilde\phi$ on $C_N$ almost
surely, for sufficiently large $T$, the maximizers of $\Sigma_T \tilde\phi$ over $C_N$ are also not
in $D_N$. By the convexity of $C$ and concavity of $\Sigma_T \tilde\phi$ it follows that for
sufficiently large $T$, the maximizers of $\Sigma_T \tilde\phi$ over $C_N$ coincide with those over
$C$. Consequently, the almost sure convergence of $\{\tilde d_T\}$ to $\tilde\delta$ follows from the
almost sure uniform convergence of $\{\Sigma_T \tilde\phi\}$ on $C_N$. Q.E.D.
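
The consistency argument can be mimicked numerically in a one-asset setting with a short-sale constraint. The sketch below is illustrative only: it assumes the sample criterion is the average of $y^2 - ((y - x\alpha)^+)^2 - 2\alpha q$ (the form of problem (9) is not restated in this appendix), takes $C = [0, \infty)$, truncates the search to $[0, N]$ in the spirit of the set $C_N$ in the proof, and uses a made-up two-state population.

```python
import random

def sample_criterion(alpha, data):
    """Sample average of y^2 - ((y - x*alpha)^+)^2 - 2*alpha*q over (x, q, y) rows."""
    total = 0.0
    for x, q, y in data:
        gap = max(y - x * alpha, 0.0)
        total += y * y - gap * gap - 2.0 * alpha * q
    return total / len(data)

def bound_squared(data, upper=10.0, iters=100):
    """Maximize the concave sample criterion over alpha in [0, upper] by ternary search."""
    lo, hi = 0.0, upper
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if sample_criterion(m1, data) < sample_criterion(m2, data):
            lo = m1
        else:
            hi = m2
    return sample_criterion(0.5 * (lo + hi), data)

# Hypothetical population: two equally likely states, constant price 1.2, proxy y = 1.
population = [(1.0, 1.2, 1.0), (2.0, 1.2, 1.0)]
delta_sq = bound_squared(population)   # population value of the squared bound

random.seed(0)
for T in (100, 5000):
    sample = random.choices(population, k=T)
    print(T, bound_squared(sample), "population:", delta_sq)
```

For this population the criterion reduces to $0.6\alpha - 2.5\alpha^2$ near the optimum, so the maximizer is $\alpha = 0.12$ and the squared bound is 0.036. The truncation level `upper` plays the role of $N$: it must be large enough that no maximizer sits on the boundary.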

We now turn to the results in Section 2.C and investigate the
statistical consistency of sample analog estimators $\{\ell_T\}$ and $\{u_T\}$ for the
arbitrage bounds $\lambda_0$ and $\upsilon_0$. Recall that the arbitrage bounds are
representable as solutions to linear programming problems. Since there is no
natural compact set for the choice variables in these problems, we must
explore "directions to infinity." We study these "directions" using a
compactification of the parameter space.

First  consider  any  a  e  C  such  that  a' x  £  -1  with  probability  one.  Then 
with  probability  one  a'x  £  -1  for  all  t  with  probability  one  and  (aTq) 


I         <   u-'Tji'     ^  follows  that  lim    sup    I         *   A      with  probability  one. 


To construct a compact parameter space, we map the original parameter
space for each problem into the closed unit ball in $\mathbb{R}^n$, which we denote as $\mathcal{U}$.
We consider explicitly the case of $u_T$. The proofs for the case of $\ell_T$ are
completely analogous to the case for $u_T$ and are omitted.

Notice that the constraint set used in defining $\upsilon_0$ can be represented as
the set of all $\alpha \in C$ satisfying the equation:

$$E[(1 - \alpha'x)^+] = 0 . \tag{A.3}$$

As in the proof of Lemma A.2, we map the parameter space into the unit ball
(with a slightly different transformation). The transformation $\zeta = \alpha/(1+|\alpha|)$
maps $\mathbb{R}^n$ into the open unit ball. To compactify the transformed parameter
space, we consider adding the boundary points of the unit ball. Notice that
we can recover the original parameterization by considering the inverse
mapping:

$$\alpha = \zeta/(1 - |\zeta|) \tag{A.4}$$

for $|\zeta| < 1$. Using the transformation in (A.4), instead of considering those
$\alpha$'s that satisfy (A.3) we consider:

$$D^* = \{\, \zeta \in \mathcal{U} \cap C \;|\; E\{[(1 - |\zeta|) - x'\zeta]^+\} = 0 \,\} . \tag{A.5}$$
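
The change of variables behind (A.4) and (A.5) is easy to check numerically. The following sketch is illustrative; the function names `to_ball` and `from_ball` are hypothetical and do not appear in the paper.

```python
import math

def to_ball(alpha):
    """Map alpha in R^n to zeta = alpha / (1 + |alpha|), a point in the open unit ball."""
    norm = math.sqrt(sum(a * a for a in alpha))
    return [a / (1.0 + norm) for a in alpha]

def from_ball(zeta):
    """Inverse map as in (A.4): alpha = zeta / (1 - |zeta|), defined only for |zeta| < 1."""
    norm = math.sqrt(sum(z * z for z in zeta))
    if norm >= 1.0:
        raise ValueError("boundary points of the unit ball have no finite pre-image")
    return [z / (1.0 - norm) for z in zeta]

alpha = [3.0, -4.0]        # |alpha| = 5
zeta = to_ball(alpha)      # |zeta| = 5/6
print(zeta, from_ball(zeta))
```

Because $1 - \alpha'x = [(1 - |\zeta|) - x'\zeta]/(1 - |\zeta|)$ whenever $|\zeta| < 1$, the positive part in (A.3) vanishes exactly when the positive part in (A.5) does, so interior points of the ball carry the same constraint. Only the added boundary points, where `from_ball` is undefined, require separate treatment.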

This transformation potentially adds solutions to (A.3) by including the
boundary of the unit ball. The potentially problematic values of $\zeta$ are those
for which $x'\zeta \ge 0$, $\zeta \in C$ and $|\zeta| = 1$. We rule this out by limiting attention
to values of $\zeta$ in


(A. 6) 


In effect, by focusing on $\zeta$'s in $D$ we are eliminating $\zeta$'s corresponding to
payoffs with "high" prices. This does not cause us problems because we are
concerned with estimated upper arbitrage bounds that are too low, not too
high. Also, any $\zeta$ in $D$ for which $|\zeta| = 1$ must have an (average) price that
is nonpositive. This eliminates the troublesome points (directions) from $D$.


Lemma A.3: Suppose that $\upsilon_0 < \infty$. Then $\limsup\, D_T^* \cap D_T \subset D^* \cap D$.

Proof: First notice that since $\Sigma_T q$ converges to $Eq$ almost surely, then
$\eta(D_T, D)$ converges almost surely to 0. We next establish that $\lim D_T^* = D^*$.
To do this we first show that $\Sigma_T[(1 - |\zeta|) - x'\zeta]^+$ converges uniformly to
$E[(1 - |\zeta|) - x'\zeta]^+$ on $\mathcal{U}$. Note that $\mathcal{U} \cap C$ is compact and that:

$$E\big|[(1 - |\zeta_1|) - x'\zeta_1]^+ - [(1 - |\zeta_2|) - x'\zeta_2]^+\big| \le \big(1 + (E|x|^2)^{1/2}\big)\,|\zeta_1 - \zeta_2| . \tag{A.7}$$

This is sufficient for the Uniform Law of Large Numbers of Hansen (1982) to
apply. Hence, from Lemma A.1, the lim sup of the sequence of minimizers of
$\Sigma_T[(1 - |\zeta|) - x'\zeta]^+$ over $\mathcal{U} \cap C$ is contained in the set of minimizers of
$E[(1 - |\zeta|) - x'\zeta]^+$. Since $\upsilon_0 < \infty$, the set $D^*$ is not empty and $D^*$ is the set of
minimizers of $E[(1 - |\zeta|) - x'\zeta]^+$. With probability one, any point in $D^*$
must also be in $D_T^*$ for all $T \ge 1$. Since $D^*$ is separable, a common probability


Proof:      First  note  that 


$$u_T = \min\{\, \zeta'\Sigma_T q/(1 - |\zeta|) \;|\; \zeta \in D_T^* \cap D_T \,\} \quad \text{for sufficiently large } T, \tag{A.8}$$
and 


in smaller values of the maximized criterion. For instance, suppose the
constraint set is augmented to include all of the points in $D^* \cap D$. Then
Lemma A.3 implies that this sequence of augmented constraint sets converges
to $D^* \cap D$. The conclusion then follows from Lemma A.1. Q.E.D.


Lemma A.5: Suppose that $\upsilon_0 = \infty$. Then $\{u_T\}$ diverges with probability one.


Proof: Since $\upsilon_0 = \infty$, there are no values of $\alpha \in C$ such that $\alpha'x \ge 1$ with
probability one. Consequently, the only values of $\zeta$ in $D^*$ are ones for which
$|\zeta| = 1$. We consider two cases. First suppose that $D^* = \emptyset$. The uniform
convergence of $\Sigma_T[(1 - |\zeta|) - x'\zeta]^+$ to $E[(1 - |\zeta|) - x'\zeta]^+$ implies that for
sufficiently large $T$, $D_T^* = \emptyset$ and $u_T = \infty$. Next suppose that $D^* \ne \emptyset$. Since
there are no arbitrage opportunities (Assumption 1.2), $\zeta'Eq > 0$ for any $\zeta$ in
$D^*$ such that $\|\zeta'x\| > 0$. Also, Assumption 1.2 together with the no-redundancy
Assumption 1.3 imply that $\zeta'Eq > 0$ for any $\zeta$ in $D^*$ such that $\|\zeta'x\| = 0$.
Furthermore, $D^*$ is closed implying that

$$\epsilon = \inf\{\zeta'Eq : \zeta \in D^*\} > 0 . \tag{A.9}$$

Since $\{\Sigma_T q\}$ converges to $Eq$ almost surely and $D_T^*$ converges almost surely to
$D^*$, it follows from Lemma A.1 that with probability one for sufficiently
large $T$, $\zeta'\Sigma_T q > \epsilon/2$ for all $\zeta \in D_T^*$. The convergence of $\{D_T^*\}$ to $D^*$ coupled
with the fact that all elements of $D^*$ have norm one then implies that $\{u_T\}$
diverges almost surely. Q.E.D.
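
The divergence in Lemma A.5 has a simple one-asset counterpart. The sketch below rests on assumptions that should be read as illustrative: following (A.3) and (A.8), the sample upper bound is taken to be the smallest average cost $\alpha \Sigma_T q$ over $\alpha \ge 0$ with $\alpha x_t \ge 1$ at every sample date, the average price is assumed positive, and the payoff distribution (uniform on $(0, 1]$) is made up.

```python
import math
import random

def upper_bound_estimate(xs, qs):
    """One-asset sample analog: min alpha*mean(q) over alpha >= 0 with alpha*x_t >= 1 for all t.

    Assumes mean(q) > 0, so the cheapest feasible alpha is 1 / min(x_t).
    """
    x_min = min(xs)
    if x_min <= 0.0:
        return math.inf   # no nonnegative alpha dominates a unit payoff in this sample
    return (sum(qs) / len(qs)) / x_min

random.seed(0)
for T in (10, 1000, 100000):
    xs = [1.0 - random.random() for _ in range(T)]   # payoffs in (0, 1], arbitrarily close to 0
    qs = [0.5] * T                                   # constant positive price
    print(T, upper_bound_estimate(xs, qs))
```

Because the payoff can be arbitrarily close to zero, no finite position dominates a unit payoff with probability one, so the population bound is infinite. In samples the smallest observed payoff shrinks as $T$ grows, and the estimate grows without bound.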

Taken together, Lemmas A.3, A.4 and A.5 imply Proposition 2.3.

Appendix  B:   Asymptotic  Distribution  of  the  Bounds  Estimators 

In  this  appendix  we  show  that  in  the  case  in  which  the  prices  of  the 
payoffs  are  constant,  the  asymptotic  distribution  of  the  estimated  bounds  can 
be  demonstrated  even  when  the  parameter  vector  is  not  uniquely  identified 
(even  when  Assumption  1.4  is  not  satisfied). 

Proof  of  Proposition  2.2: 

We consider the case of $\tilde d_T$. The case of $d_T$ is similar. Let $h_T$ be the
set of maximizers of $\Sigma_T \tilde\phi$ and let $h_\infty$ be the set of maximizers of $E\tilde\phi$. For each
$T$, let $a_T$ be a measurable selection from $h_T$ (see Theorem 1 of Hildenbrand
(1974), page 54). Since $\limsup h_T = h_\infty$ almost surely and $h_\infty$ is compact,
there is a sequence $\{\alpha_T\}$ in $h_\infty$ such that $\lim |a_T - \alpha_T| = 0$ almost surely (see
Appendix A). Further, an implication of Lemma A.1 of Hansen and Jagannathan
(1991) is that all $\alpha \in h_\infty$ result in the same random variable $\tilde m = (y - \alpha'x)^+$.
Also, the complementary slackness condition for problem (9) implies that for
$\alpha \in h_\infty$, $\alpha'q = E\{y(y - \alpha'x)^+ - (y - \alpha'x)^{+2}\}$, so that $\alpha'q$ is the same for all $\alpha \in
h_\infty$. As a result, the random variable $\tilde\phi(\alpha)$ is the same for all $\alpha \in h_\infty$. Now
consider the decomposition of $\sqrt{T}[(\tilde d_T)^2 - \tilde\delta^2]$ as in (23):

$$\sqrt{T}[(\tilde d_T)^2 - \tilde\delta^2] = \sqrt{T}\,\Sigma_T[\tilde\phi(a_T) - \tilde\phi(\alpha_T)] + \sqrt{T}\,\Sigma_T[\tilde\phi(\alpha_T) - E\tilde\phi(\alpha_T)] . \tag{B.1}$$


As in relation (26), we have:

$$0 \le \sqrt{T}\,\Sigma_T[\tilde\phi(a_T) - \tilde\phi(\alpha_T)] \le \sqrt{T}\,\Sigma_T[(\tilde m x - q) - E(\tilde m x - q)] \cdot (a_T - \alpha_T) . \tag{B.2}$$


Since $|a_T - \alpha_T|$ converges almost surely to 0, the result follows. Q.E.D.
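
The complementary slackness identity used in this proof can be checked in a few lines. The derivation below is a sketch under an assumption that is not restated in this appendix: that the Kuhn-Tucker conditions for problem (9) imply $\alpha'[E(\tilde m x) - q] = 0$ at a maximizer $\alpha$, with $\tilde m = (y - \alpha'x)^+$ and constant prices $q$.

```latex
\begin{align*}
\alpha' q &= E\big[\tilde m \, \alpha' x\big]
           = E\big[\tilde m \,\{ y - (y - \alpha' x) \}\big] \\
          &= E\big[ y \, \tilde m \big] - E\big[ \tilde m \,(y - \alpha' x) \big]
           = E\big\{ y\,(y - \alpha' x)^{+} - [(y - \alpha' x)^{+}]^{2} \big\}.
\end{align*}
```

The last step uses $\tilde m\,(y - \alpha'x) = [(y - \alpha'x)^+]^2$. Since the right-hand side depends on $\alpha$ only through $\tilde m$, and $\tilde m$ is the same for every maximizer, $\alpha'q$ is the same for every maximizer as well.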


Appendix  C:   Asymptotic  Distribution  of  the  Multipliers 

In  this  appendix  we  consider  the  asymptotic  distribution  of  our 
estimator  of  the  Lagrange  multipliers  when  there  are  no  transaction  costs. 
We  begin  by  demonstrating  that  restrictions  used  in  Hansen  (1982)  can  be 
extended  along  the  lines  of  Pollard  (1985)  and  Pakes  and  Pollard  (1989)  to 
accommodate  "kinks"  in  the  functions  used  to  represent  the  moment  conditions. 
We  then  show  how  to  use  this  result  to  prove  Proposition  4.1. 

The notation used in our initial proposition for GMM estimators
conflicts with some of the notation used elsewhere in the paper. We let $\beta_0$
denote the parameter vector of interest and $\beta$ any hypothetical point in the
underlying parameter space T. The parameter space is restricted to satisfy:

Assumption C.1: T contains an open ball about $\beta_0$.

We will use the construct of a random function. A random function $\psi_t$ maps the
set of sample points into the space of vector-valued continuous functions on
T. We require that $\psi_t(\beta)$ be an n-dimensional random vector for each $\beta$ in T.
We also consider an approximating function $\psi_t^*$
that is linear in $\beta$. The composite random function satisfies:

Assumption C.2: $\{(\psi_t', \psi_t^{*\prime})'\}$ is stationary and ergodic and has finite first
moments.

We now specify the sense in which $\psi_t^*$ is required to approximate $\psi_t$. The
approximation error induced by using $\psi_t^*$ in place of $\psi_t$ is

$$r_t(\beta) = |\psi_t(\beta) - \psi_t^*(\beta)| .$$

Define: 


Note that $\mathrm{dmod}_t(\delta)$ is monotone in $\delta$. Therefore, we can take almost sure
limits as $\delta$ declines to zero. We impose the following restrictions on $\mathrm{dmod}_t$.


Assumption C.3: $\lim_{\delta \downarrow 0} \mathrm{dmod}_t(\delta) = 0$ almost surely.

Assumption C.4: $E[\mathrm{dmod}_t(\delta)] < \infty$ for some $\delta > 0$.


The approach adopted in Hansen (1982) is to restrict the modulus of
continuity of the derivative of $\psi_t$ to converge almost surely to zero and to
have a finite expectation for some neighborhood of the parameter. It follows
from the Mean-Value Theorem that restrictions imposed in Hansen (1982) on the
local behavior of $\psi_t$ imply Assumptions C.3 and C.4.
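
For a kinked moment function the Mean-Value Theorem route is unavailable, but the approximation error can be bounded directly. The sketch below is a scalar illustration under assumptions of our own: the moment function is taken to be $x(y - x\beta)^+$, the approximating function is $x(y - x\beta)1_{\{y - x\beta_0 \ge 0\}}$ (linear in $\beta$), and the Gaussian data are simulated. It checks that the ratio $r(\beta)/|\beta - \beta_0|$ never exceeds $x^2$.

```python
import random

def approx_error_ratio(x, y, beta, beta0):
    """r(beta)/|beta - beta0| for psi = x*(y - x*beta)^+ and its linear-in-beta approximation."""
    psi = x * max(y - x * beta, 0.0)
    # The approximation freezes the kink indicator at beta0.
    psi_star = x * (y - x * beta) if y - x * beta0 >= 0.0 else 0.0
    return abs(psi - psi_star) / abs(beta - beta0)

random.seed(1)
beta0 = 0.5
worst = 0.0
for _ in range(10_000):
    x, y = random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)
    if x == 0.0:
        continue
    # Perturb beta0 by between 0.01 and 0.1 in either direction.
    beta = beta0 + random.choice((-1.0, 1.0)) * random.uniform(0.01, 0.1)
    worst = max(worst, approx_error_ratio(x, y, beta, beta0) / (x * x))
print("largest ratio relative to x^2:", worst)
```

The error is nonzero only when $y - x\beta$ and $y - x\beta_0$ lie on opposite sides of the kink, and in that event $|y - x\beta| \le |x||\beta - \beta_0|$. This is why a finite second moment of $x$ is enough to give the modulus a finite expectation in this example.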

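We use Assumptions C.3 - C.4 to study the sense in which $\Sigma_T \psi_t$ is
stochastically differentiable. Hence look at the approximation error

$$c_T(\delta) = \sup\{\, |\Sigma_T \psi_t(\beta) - \Sigma_T \psi_t^*(\beta)| / |\beta - \beta_0| \;:\; |\beta - \beta_0| < \delta,\ \beta \ne \beta_0 \,\} .$$

By the Triangle Inequality we have that

$$c_T(\delta) \le \Sigma_T\, \mathrm{dmod}_t(\delta) .$$

Thus by Assumptions C.1-C.2, we have that

$$\lim_{\delta \downarrow 0} \limsup_{T \to \infty} c_T(\delta) \le \lim_{\delta \downarrow 0} E\,\mathrm{dmod}_t(\delta) = 0 . \tag{C.1}$$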


This in turn implies the stochastic differentiability condition in Pollard
(1985) because the counterpart to $c_T(\delta)$ in Pollard's condition is scaled by
$\sqrt{T}|\beta - \beta_0| / (1 + \sqrt{T}|\beta - \beta_0|)$, which is less than one. Also, the iterated limit
in (C.1) implies the limit taken in Pollard's condition because $c_T$ is
monotone in $\delta$. The differentiability of limiting moment function $E\psi_t$ follows
directly from Assumption C.4. Therefore, $\Sigma_T \psi_t - E\psi_t$ satisfies the stochastic
differentiability condition with derivative at $\beta_0$ given by $\Sigma_T A_t - EA_t$. Since
$\{A_t\}$ is stationary and ergodic, $\{\Sigma_T A_t - EA_t\}$ converges almost surely to zero;
hence the derivative is asymptotically negligible.

Next we impose a global identification condition on the approximating
function $\psi_t^*$. Since the approximation of $\psi_t$ by $\psi_t^*$ is local, this condition
can also be viewed as a local identification condition on the original
function $\psi_t$.


This  rank  condition  on  the  derivative  together  with  the  stochastic 
differentiability  conditions  already  established  imply  the  equicontinuity 
condition  (iii)  in  Theorem  3.3  of  Pakes  and  Pollard  (1989)  (see  the 
discussion  on  page  1043  of  Pakes  and  Pollard). 


»A*(bT 


)  =  o 


combination of moment conditions to be used in estimation.

Assumption C.6: $\{b_T\}$ converges in probability to $\beta_0$.

Assumption C.7: $\{a_T\}$ converges in probability to a nonrandom matrix $a_0$
where $a_0 EA_t$ is nonsingular.

Finally,  to  obtain  a  limiting  distribution  for  {b   }  we  assume: 

Assumption C.8: $\{\sqrt{T}\,\Sigma_T \psi_t(\beta_0)\}$ converges in distribution to a normally
distributed random vector with mean zero and nonsingular covariance matrix
$V_0$.

Sufficient conditions for Assumption C.8 can be obtained using martingale
approximations as described by Gordin (1969), Hall and Heyde (1980) and
Hansen (1985). This condition implies that $E\psi_t(\beta_0)$ is equal to zero.

The  following  extension  of  Theorem  3.1  in  Hansen  (1982)  is  now  a  direct 
consequence  of  Theorem  3.3  and  Lemma  3.5  in  Pakes  and  Pollard  (1989). 

Proposition C.1: Suppose that Assumptions C.1-C.8 are satisfied. Then
$\{\sqrt{T}(b_T - \beta_0)\}$ converges in distribution to a normally distributed random vector
with mean zero and covariance matrix $[a_0 E(A_t)]^{-1} a_0 V_0 a_0' [E(A_t')a_0']^{-1}$.

Estimation of $EA_t$ follows as in Hansen (1982) as long as $A_t$ can be expressed in


finite first moment for some $\delta > 0$. In this case, $\{\Sigma_T D(b_T)\}$ converges in
probability to $EA_t$.


Proof  of  Proposition  4.1: 

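In light of Proposition C.1, we now verify that our approximation in
(44) satisfies Assumption C.3. Let $r(\alpha)$ denote the random approximation
error:

$$r(\alpha) = \big| x(y - x'\alpha)\big(1_{\{y - x'\alpha \ge 0\}} - 1_{\{y - x'\alpha_0 \ge 0\}}\big) \big| . \tag{C.2}$$

It follows from the Cauchy-Schwarz Inequality that

$$\begin{aligned} r(\alpha) &\le |x(y - x'\alpha)|\,\big|1_{\{y - x'\alpha \ge 0\}} - 1_{\{y - x'\alpha_0 \ge 0\}}\big| \\ &\le |xx'\alpha - xx'\alpha_0|\,\big|1_{\{y - x'\alpha \ge 0\}} - 1_{\{y - x'\alpha_0 \ge 0\}}\big| \end{aligned} \tag{C.3}$$

where the second inequality follows because $|xx'\alpha - xx'\alpha_0|$ dominates $|x(y - x'\alpha)|$
whenever $y - x'\alpha$ and $y - x'\alpha_0$ have opposite signs. Therefore, the random
approximation error satisfies:

$$r(\alpha)/|\alpha - \alpha_0| \le |x|^2 \tag{C.4}$$

for $\alpha \ne \alpha_0$, implying that the modulus of differentiability

$$\mathrm{dmod}(\epsilon) = \sup\{\, r(\alpha)/|\alpha - \alpha_0| \;:\; |\alpha - \alpha_0| < \epsilon,\ \alpha \ne \alpha_0 \,\} \tag{C.5}$$

is dominated by $|x|^2$. Combined with Assumption 1.1 this implies that for any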

positive value of $\epsilon$, $E[\mathrm{dmod}(\epsilon)]$ is finite. As $\epsilon \downarrow 0$, dmod($\epsilon$) goes to zero
except when $1_{\{y - x'\alpha_0 = 0\}} = 1$. In this case it is possible to choose $\alpha_\epsilon$ such
that $|\alpha_\epsilon - \alpha_0| < \epsilon$ and $1_{\{y - x'\alpha_\epsilon < 0\}} = 1$, so that $r(\alpha_\epsilon) = |xx'(\alpha_\epsilon - \alpha_0)|$. However, Assumption
4.1 implies that this occurs with probability zero, so dmod($\epsilon$)
converges almost surely to zero. Q.E.D.

Formally $C^*$ is the dual cone of $C$.


2
A weaker version of this restriction would replace Eq by q. In effect,
Assumption 1.3 does more than eliminate redundant securities. It also
precludes cases in which distinct portfolio weights give rise to the same
payoff, with possibly different prices but the same expected prices.

3
Hansen and Jagannathan (1993) showed that the least squares distance between
a proxy and the set $M$ of (possibly negative) stochastic discount factors has
an alternative pricing-error interpretation. Formally, the pricing-error
interpretation for the least squares problem (6) is

$$\inf_{m \in M} \; \sup_{p \in P,\ Ep^2 = 1} |Emp - Eyp|$$

and for (7) is

$$\inf_{m \in M^+} \; \sup_{p \in H,\ Ep^2 = 1} |Emp - Eyp|$$

where $H$ is the set of payoffs on hypothetical derivative claims.

Assumption 2.1 could be weakened in a variety of ways, but it is maintained
for pedagogical simplicity. More generally, we might imagine that the
process $\{(x_t', q_t', y_t)\}$ is asymptotically stationary, where the convergence to
the stationary distribution is sufficiently fast to ensure that the Law of
Large Numbers applies to averages of the form (17). In this case, the joint
distribution of $(x', q', y)$ is given by the stationary limit point of the


The  Hausdorff  metric  is  usually  employed  for  compact  sets  to  ensure  that  the 
resulting  distance  is  finite.  Because  of  the  vertical  character  of  the 
regions  and  the  existence  of  finite  arbitrage  bounds,  the  Hausdorff  distance 
will  be  finite  even  though  the  sets  are  not  bounded.  The  Euclidean  distance 
in  (30)  could  be  replaced  by  the  square  root  of  a  quadratic  form  in  the 
differences  between  two  points  as  long  as  a  positive  weight  is  given  to  both 
dimensions. 

Even  if  hypothesis  (35)  is  satisfied,  the  sample  analog  may  be  infinite, 
making  implementation  problematic.  This  happens  when  the  sample  mean  is 
outside  the  estimated  arbitrage  bounds.  This  phenomenon  does  not  arise  for 
hypothesis  (34). 

7
Burnside (1994) and Cecchetti, Lam and Mark (1993) developed and studied
alternative versions of the volatility bounds tests when no transactions
costs are introduced. The test used by Cochrane and Hansen (1992) abstracted
from positivity and can be formulated equivalently using $\phi$ in (34). See
Burnside (1994) for a Monte Carlo comparison of various volatility tests.

The impetus for this work was the econometric discussion in an unpublished
precursor to this paper: Hansen and Jagannathan (1988).


9
Our formal derivation of the distribution theory uses a result from Pakes
and Pollard (1989). A byproduct from our analysis in the appendix is a
(modest) weakening of the assumptions imposed in Hansen (1982) to accommodate
kinks in the moment conditions used in estimation.


Haberman  characterized  this  nonlinear  function  as  a  particular  projection 
onto  a  closed  convex  set  formed  by  translating  C  by  -y.  Although  Haberman 
(1989)  only  considers  the  case  in  which  the  data  are  iid,  his 
characterization  of  the  limiting  distribution  applies  more  generally  with  a 
covariance matrix replaced by a spectral density matrix at frequency zero.